Software Engineer, Data Infrastructure Lead

Posted 9 Days Ago
2 Locations
Senior level
Artificial Intelligence • Software
The Role
Design and maintain large-scale batch processing pipelines, optimize data handling, and support researchers with their data needs and modeling requirements.

Who we are

EvolutionaryScale’s mission is to develop artificial intelligence to understand biology for the benefit of human health and society, through open, safe, and responsible research, and in partnership with the scientific community. Over the next ten years, AI will transform biological design, making molecules and entire cells programmable. We will develop the foundation models for biology that enable this.

The EvolutionaryScale team is based in San Francisco and New York. We believe in flexibility around work schedules and locations, but expect team members to work from one of our offices at least half of the days in most weeks.

What you’ll do

As a Data Infrastructure Engineer, you will work closely with bioinformatics and research teams to ensure our data jobs are reliable, efficient, and scalable. You'll implement best practices for handling large-scale data processing, select and integrate the right technologies, and drive continuous improvements in the performance and quality of our datasets.

The role

  • Design, develop, and maintain large-scale batch processing pipelines, using tools like Spark and Ray, to acquire biology datasets.
  • Manage data infrastructure components to ensure robust and fault-tolerant operations.
  • Optimize data ingestion, storage, and retrieval processes for acquiring large and growing biology datasets, and for efficient pre- and post-training data ingestion.
  • Create systems for easy and reproducible data evaluation and experiments.
  • Integrate modern ML-based data curation technologies with data processing pipelines.
  • Work with researchers and other engineering teams to understand data needs and create solutions that meet modeling requirements.

Preferred qualifications

Apply even if you don’t meet all of these!

  • Proven experience with large-scale data processing systems using technologies such as Hadoop, Spark, or Ray.
  • Knowledge of streaming data frameworks like Kafka Streams, Spark Streaming, or Flink.
  • Understanding of data processing principles and best practices.
  • Strong problem-solving skills, including the ability to research, debug, and resolve complex technical problems.
  • Experience with major cloud providers (AWS, GCP, or Azure), including familiarity with data warehousing tools, is a plus.
  • Knowledge of biology and biology datasets is a big plus but not required.
  • Experience with large-scale distributed systems or machine learning is also a plus but not required.
  • 5+ years of experience with the above systems.


Top Skills

AWS
Azure
Flink
GCP
Hadoop
Kafka Streams
Ray
Spark
Spark Streaming

The Company
18 Employees
Remote Workplace

What We Do

Company behind ESM3
