Data Infra Engineer

Posted 21 Days Ago
Mountain View, CA
Hybrid
Senior level
Artificial Intelligence • Cloud • Machine Learning • Software • Database
The Role
The Data Infrastructure Engineer will be responsible for building end-to-end, production-grade data solutions and scalable ETL pipelines, and for managing data storage and security. They will work closely with the Machine Learning and Data Platform team to ensure high-quality data ingestion and transformation while engaging with various stakeholders.
Summary Generated by Built In

Build the Future of AI Infrastructure with Kumo!


Companies invest millions in storing terabytes of data in data lakehouses, yet only a small fraction is leveraged for predictive insights. Traditional machine learning pipelines are slow and complex, requiring months of engineering effort for data preparation, feature engineering, and model training.


At Kumo, we are redefining AI infrastructure for data lakehouses, enabling businesses to harness the power of Graph Neural Networks with minimal effort. Our platform eliminates the complexities of traditional ML pipelines, allowing users to train high-performance models directly on their relational data with just a few lines of Predictive Query Language (PQL).


We are looking for Data Infrastructure Engineers to join our team and help build a scalable, high-performance ML platform. If you thrive in designing robust, cloud-native infrastructure, optimizing data pipelines, and building scalable services, we’d love to hear from you!

As a Data Infrastructure Engineer at Kumo, you will:

  • Design and optimize scalable, cloud-native infrastructure for high-performance ML workloads.
  • Develop and maintain efficient data ingestion pipelines and connectors for large-scale datasets.
  • Build and enhance resilient ETL pipelines to transform, process, and store data for analytics and ML.
  • Implement best practices for data security, governance, and sharing within distributed environments.
  • Optimize performance of data processing frameworks, including Spark, Presto, and Hive.
  • Automate deployment of infrastructure using Kubernetes, Terraform, and CI/CD tools.
  • Work closely with data scientists and ML engineers to bridge infrastructure with machine learning applications.

Your Foundation:

  • 1+ years of experience as an Infrastructure Engineer, Data Engineer, or related role in SaaS/Enterprise environments.
  • Strong expertise in building, scaling, and maintaining cloud infrastructure (AWS, GCP, or Azure).
  • Hands-on experience with data storage, ingestion, and processing in distributed environments.
  • Proficiency in ETL development and building high-performance data pipelines.
  • Solid understanding of databases, storage formats (Parquet, Avro, Arrow, JSON), and schema designs.
  • Experience working with orchestration tools such as Temporal, Airflow, or Luigi.
  • Strong programming skills in Python, Scala, or Java.
  • Knowledge of containerization and orchestration (Docker, Kubernetes).
  • Experience with Infrastructure as Code (Terraform, CloudFormation, Pulumi).
  • Ability to debug performance bottlenecks and optimize distributed computing workloads.
  • Excellent communication skills, with the ability to collaborate effectively across teams.

Bonus Points:

  • Expertise in Spark, Presto, or Hive for large-scale data processing.
  • Experience with serverless architectures and event-driven processing (AWS Lambda, Kinesis, Kafka).
  • Familiarity with Databricks, Azure Data Factory (ADF), or cloud ML solutions.
  • Understanding of high availability, fault tolerance, and observability in cloud environments.

Why Join Kumo?

  • Be part of a cutting-edge AI and ML infrastructure team revolutionizing how companies leverage their data.
  • Work with top engineers and data scientists to solve complex, large-scale infrastructure challenges.
  • Competitive salary, equity, and benefits in a fast-growing AI company.
  • Flexible work environment with opportunities to shape the future of AI-powered data platforms.

Ready to build the next-gen AI infrastructure? Apply today!


We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Top Skills

Java
Python
Scala
The Company
HQ: Mountain View, CA
38 Employees
On-site Workplace
Year Founded: 2021

What We Do

Democratizing AI on the Modern Data Stack!

The team behind PyG (PyG.org) is working on a turn-key solution for AI over large-scale data warehouses. We believe the future of ML is a seamless integration between modern cloud data warehouses and AI algorithms. Our ML infrastructure massively simplifies the training and deployment of ML models on complex data.

With over 40,000 monthly downloads and nearly 13,000 GitHub stars, PyG is the ultimate platform for training and developing Graph Neural Network (GNN) architectures. GNNs, currently one of the hottest areas of machine learning, are a class of deep learning models that generalize Transformer and CNN architectures and enable us to apply the power of deep learning to complex data. GNNs are unique in the sense that they can be applied to data of different shapes and modalities.
