DataHub, built by Acryl Data, is an AI & Data Context Platform adopted by over 3,000 enterprises, including Apple, CVS Health, Netflix, and Visa. Developed jointly with a thriving open-source community of 13,000+ members, DataHub's metadata graph provides in-depth context on AI and data assets with best-in-class scalability and extensibility.
The company's enterprise SaaS offering, DataHub Cloud, delivers a fully managed solution with AI-powered discovery, observability, and governance capabilities. Organizations rely on DataHub solutions to accelerate time-to-value from their data investments, ensure AI system reliability, and implement unified governance, enabling AI & data to work together and bring order to data chaos.
In this role, you will:
- Enhance the Python-based ingestion framework to support ingesting usage statistics, lineage, and operational metadata from systems like Snowflake, Redshift, Kafka, and more
- Build connectors for major systems in the modern data and ML stacks
- Enable the ingestion framework to run in a cloud native environment
Requirements:
- Minimum 4 years of engineering experience
- Expertise in Python
- Familiarity with tools in the modern data and ML ecosystem
- Knowledge of distributed systems
- Ability to design for scale and fault tolerance
Remote first. We're a fully distributed company, and our communication culture deliberately mixes meetings with writing. We're writing-heavy because it forces clarity of thought, and we keep plenty of synchronous time for collaborative ideation.
Benefits
- Competitive salary
- Equity
- Medical, dental, and vision insurance (99% coverage for employees, 65% coverage for dependents; USA-based employees)
- Carrot Fertility Program (USA-based employees)
- Remote friendly
- Work from home and monthly co-working space budget
What We Do
Founded by the leaders who built data teams at LinkedIn and Airbnb, Acryl Data enables you to take back control of your fragmented data stack. We do this by driving DataHub, the #1 open-source metadata platform, which has a community of 8,000+ data practitioners and is deployed in 1,000+ companies.
Acryl DataHub is a third-generation streaming metadata platform that integrates with 50+ tools in the data stack (dbt, Kafka, Snowflake, Airflow, Looker, etc.) to enable data discovery, data lineage, data governance, and data observability.
✅ Connect to your data sources within minutes, and gain end-to-end visibility.
✅ Power mission-critical workflows with a SOC-2-compliant platform.
✅ Bring data and business teams together with a single source of truth to create governed data products.
Powering data teams at Notion, Zendesk, Riskified, and many more!