About the job
The world’s most critical (and most at-risk) business applications have been neglected for far too long. Onapsis eliminates this blind spot by providing cybersecurity solutions dedicated to business-critical applications. Whether running on premises, in the cloud, or in a hybrid environment, Onapsis helps nearly 30% of the Forbes Global 100 understand the threats and risks across their SAP and Oracle landscapes.
We are seeking a Principal Data Engineer to join our mission-driven team. This role is ideal for experienced data engineers with a proven track record in architecting scalable data pipelines, leveraging cloud technologies, and contributing to high-impact cybersecurity solutions. You will be responsible for building high-performance ETL frameworks, optimizing data platforms, and contributing directly to the enhancement of our customers' threat detection, response, and remediation capabilities.
What you will be doing, your legacy:
You will be working directly with company Principal Engineers evaluating, scoping, proposing, and building features to fulfill business solution requirements to protect our customers. Additionally, you will be working with Engineering and DevOps to deliver high-quality products and services while also working closely with security and IT professionals to ensure safe and secure best practices are followed.
Responsibilities:
- Architect and Design Scalable Data Solutions: Design, develop, and maintain highly scalable ETL/ELT pipelines across diverse data domains using cloud technologies like AWS (Glue, Redshift, Lambda, EMR, S3) and Azure (Data Factory, Synapse, Databricks).
- Data Pipeline Development: Implement data models and data processing frameworks (Spark, Kafka, Snowflake) to ingest, transform, and load large datasets (100+ TB), ensuring high availability and reliability of data.
- Advanced Data Integration: Develop solutions that integrate multiple data sources into Snowflake or similar data warehouses to enable real-time analytics and reporting across dashboards.
- AI/ML Integration: Collaborate with cross-functional teams to co-develop AI-driven features like text summarization and chatbot functionalities using AWS Bedrock, SageMaker, or similar AI/ML technologies, reducing response times and enhancing decision-making capabilities.
- Compliance and Security: Ensure compliance with industry standards and security best practices (SOX, SOC 1/2) by implementing data governance frameworks, monitoring data pipelines, and optimizing cloud database architectures to protect sensitive information.
- Stakeholder Collaboration: Work closely with stakeholders, including analysts, engineers, and product managers, to understand their data needs, propose solutions, and drive data-driven decision-making by delivering actionable insights.
- Data Infrastructure Monitoring: Continuously monitor, troubleshoot, and enhance data pipelines, leveraging CI/CD tools (Docker, Jenkins, GitHub Actions) and orchestrating workflows using Apache Airflow to maintain operational efficiency.
- Leadership and Mentorship: Provide technical leadership within the data platform organization, leading the implementation of cutting-edge cloud technologies and mentoring junior data engineers in best practices and advanced data management techniques.
- Cloud Migration: Lead large-scale database migrations from on-premises environments (Oracle, SQL Server) to cloud-based solutions like Snowflake and AWS, improving query performance and reducing technical debt.
- Documentation and Governance: Establish comprehensive documentation for data architecture, governance, and processes to ensure scalability, compliance, and security.
Qualifications:
- 5+ years of proven experience as a Data Engineer or in a similar role with a deep understanding of data architecture and cloud-based ETL/ELT frameworks.
- Strong experience with AWS and/or Azure cloud services, particularly with Glue, Redshift, Lambda, Step Functions, Databricks, Synapse, and Snowflake.
- Proficiency in big data technologies such as Apache Spark, Kafka, Hadoop, and Databricks for distributed data processing.
- Strong programming skills in Python and SQL, with experience in advanced data modeling (star, snowflake schemas) and partitioning techniques.
- Hands-on experience in building real-time data processing and AI/ML-driven analytics solutions (SageMaker, Bedrock, NLP, Power BI).
- Proven ability to architect and manage data warehouse solutions (e.g., Snowflake, Redshift) for enterprise-grade performance and reliability.
- Familiarity with compliance and audit requirements (SOX, SOC 1/2, GDPR) and implementing data governance and security frameworks.
- Strong problem-solving skills with a focus on data integrity, scalability, and performance optimization.
- Experience with CI/CD tools (Jenkins, GitHub Actions, Docker) and data orchestration platforms (Apache Airflow).
Preferred Qualifications:
- Experience with advanced data architecture principles (medallion architecture, materialized views, task scheduling).
- Proven track record of successful cloud migrations for large datasets and optimizing query performance in Snowflake or similar platforms.
- Familiarity with real-time analytics using Tableau, Power BI, and other BI tools to drive decision-making and reduce reporting lag.
- Leadership experience, including mentoring junior engineers and leading technical projects.
What we offer:
- A role in shaping the future of protecting the most critical applications that run the world's business and a career that grows as the company grows.
- A unique culture of high achievement and teamwork.
- Supportive and humble colleagues who are the space's top problem solvers and innovators.
- Financial security through competitive compensation and incentives.
Location: Onapsis is establishing a new development center in Bucharest. This is a hybrid role, so candidates must be able to commute to Bucharest every week.
About Onapsis:
Onapsis protects the business applications that run the global economy. The Onapsis Platform delivers vulnerability management, change assurance, and continuous compliance for business applications from leading vendors such as SAP, Oracle, and others. The Onapsis Platform is powered by the Onapsis Research Labs, the team responsible for the discovery and mitigation of more than 1,000 zero-day vulnerabilities in business applications.
Onapsis is headquartered in Boston, MA, with offices in Heidelberg, Germany and Buenos Aires, Argentina, and proudly serves hundreds of the world’s leading brands, including close to 30% of the Forbes Global 100, six of the top 10 automotive companies, five of the top 10 chemical companies, four of the top 10 technology companies, and three of the top 10 oil and gas companies.
For more information, connect with Onapsis on LinkedIn or visit https://www.onapsis.com.
What We Do
Onapsis protects the mission-critical applications that run the global economy, from the core to the cloud. The Onapsis Platform uniquely delivers actionable insight, secure change, automated governance and continuous monitoring for critical systems—ERP, CRM, PLM, HCM, SCM and BI applications—from leading vendors such as SAP, Oracle, Salesforce and others.
Onapsis is headquartered in Boston, MA, with offices in Heidelberg, Germany and Buenos Aires, Argentina. We proudly serve more than 300 of the world’s leading brands, including 20% of the Fortune 100, 6 of the top 10 automotive companies, 5 of the top 10 chemical companies, 4 of the top 10 technology companies and 3 of the top 10 oil and gas companies.
The Onapsis Platform is powered by the Onapsis Research Labs, the team responsible for the discovery and mitigation of more than 800 zero-day vulnerabilities in mission-critical applications. The reach of our threat research and platform is broadened through leading consulting and audit firms such as Accenture, Deloitte, IBM, PwC and Verizon, making Onapsis solutions the de facto standard in helping organizations protect their cloud, hybrid and on-premises mission-critical information and processes.