Data Engineer - Research

Hiring Remotely in United States
Remote
Mid level
Generative AI
The Role
The Data Engineer will build and maintain the data infrastructure for training AI models, managing large-scale data processing, ensuring data quality, and collaborating with research teams to meet data needs. Responsibilities include preprocessing data, developing tools for data accessibility, and organizing unstructured data.

(USA remote)
About the role:

We are looking for a talented Data Engineer to join our Data team. You will work alongside a growing, multidisciplinary team of research scientists and machine learning engineers to improve the efficiency of our models and scale them. You will contribute to building the data infrastructure that manages our data and powers the training of all Stability AI models. This is a remote position based in Germany.

Responsibilities:

  • Clean, normalize, and preprocess data in a scalable, parallelizable way to prepare it for ingestion into our machine learning model training pipelines while ensuring data quality
  • Design, implement, and maintain scalable data infrastructure for generative AI
  • Develop tools to search and serve the data at scale
  • Collaborate with multiple Research teams to understand and meet their data requirements
  • Develop and manage data processing pipelines to support machine learning teams
  • Maintain and improve data quality and integrity across various databases and data stores
  • Manage and organize large-scale unstructured data, including image, text, audio, video and 3D

Qualifications:

  • Proven experience with large-scale distributed workloads
  • Experience with large-scale data loading for machine learning training runs
  • Experience with cloud storage and file systems; AWS (S3) is strongly preferred, but we are open to other cloud platforms
  • Experience with Python
  • Experience with large-scale data processing and software development for unstructured data
  • Expertise in database, data lake, and data warehouse technologies (Redshift, BigQuery, Snowflake)
  • Experience working on machine learning projects, ideally with some deep learning or computer vision knowledge
  • Good teamwork and communication skills based on experience working with a distributed international team with timezone and cultural differences
  • Excellent communication skills to effectively collaborate with users, solve issues, and provide guidance
  • Attention to detail and the ability to document processes and solutions effectively

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Top Skills

Python
The Company
HQ: London
149 Employees
On-site Workplace

What We Do

Stability AI is building open AI tools that will let us reach our potential.

Designing and implementing solutions using collective intelligence and augmented technology.
