Salary Range: $80,000 - $92,000 per year
We believe work should be innately rewarding and a team-building venture. When we work with our teammates and our clients, it should be an enjoyable journey where we can learn, grow as professionals, and achieve amazing results. Our core values revolve around this philosophy. We are relentlessly committed to helping our clients achieve their business goals, leapfrog the competition, and become leaders in their industry. What drives us forward is a culture of creativity combined with a disciplined approach, a passion for learning & innovating, and a ‘can do’ attitude!
What We're Looking For
As a Data Engineer, you will use a variety of methods to transform raw data into useful data systems; for example, you’ll create algorithms and conduct statistical analyses. Overall, you’ll strive for efficiency by aligning data systems with business goals.
To succeed in this data engineering position, you should have strong analytical skills and the ability to combine data from different sources. You should also be familiar with several programming languages and have knowledge of machine learning methods.
Position Overview
If you are passionate about technology, want to work with a mature, highly skilled team, value total ownership of your work, and can’t imagine a day without coding, we want to speak to you! We're looking for a creative, focused, technically curious individual who enjoys both design and working hands-on with code.
As a Data Engineer, you will be working within a team to design and implement highly available data services and pipelines. This is an ideal job if you have proven experience as a technical professional and have already collaborated with teams to deliver production systems based on big data solutions.
Role Responsibilities
- Data Integration Management: Implement system-wide data integration, ensuring seamless data flow across components and services. Implement and manage data pipelines for both real-time and batch data processing.
- ETL and Data Pipeline Development: Design, build, implement, and optimize ETL processes using AWS Glue, Python, and Apache Spark to create curated data tables (see the sketch following this list). Develop and maintain AWS Glue Jobs, Crawlers, and Data Catalogs to automate data processing and metadata management.
- Database Management: Maintain raw and curated data tables within the AWS Redshift database and promote data tables across environments (DEV, TEST, IMPL, PROD), where applicable.
- AWS Ecosystem Expertise: Leverage AWS services such as AWS Glue, AWS Lambda, AWS Redshift, and S3 to manage and optimize data workflows.
- Infrastructure as Code (IaC): Utilize AWS CloudFormation to provision, manage, and maintain data infrastructure resources, ensuring repeatability and compliance with best practices. Use GitHub for version control of templates and ETL scripts.
- Data Quality & Governance: Implement data validation, cleansing, and monitoring strategies within ETL pipelines to ensure data accuracy, consistency, and compliance with organizational policies. Perform Quality Assurance (QA) checks on the curated data tables in collaboration with Tableau developers, data scientists, and data SMEs.
- Collaboration & Optimization: Work closely with data scientists, analysts, and application teams to optimize data structures, queries, and transformations for performance and cost-effectiveness.
- Security & Compliance: Ensure compliance with security best practices, access control policies, and regulatory requirements. Implement encryption and role-based access control (RBAC) as needed.
- Troubleshooting & Performance Tuning: Identify and resolve bottlenecks in data pipelines, optimize AWS Glue job performance, and ensure cost-efficient execution of workflows.
- Code Review & Version Control: Conduct peer reviews of Python and SQL code for ETLs, collaborate with other data engineers, and commit changes to GitHub.
- Statistical Programming & Documentation: Develop statistical code (SAS, R, Python) and associated reference documentation to address specific use cases.
- AWS Workspaces Testing Support: Assist the Cloud Platform team in testing programming applications and features within AWS Workspaces.
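To make the ETL and Data Pipeline Development responsibility above more concrete, here is a minimal, illustrative sketch of the kind of AWS Glue job it describes. This is not a description of any actual pipeline at OZ; the database, table, column, and bucket names (raw_db, events, event_ts, event_date, s3://example-curated-bucket/...) are hypothetical placeholders.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job bootstrap: resolve the job name Glue passes in as an argument
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a raw table registered in the Glue Data Catalog (hypothetical names)
raw = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="events"
)

# Simple cleansing with Spark: drop duplicates and rows missing a timestamp
curated = raw.toDF().dropDuplicates().filter("event_ts IS NOT NULL")

# Write the curated table to S3 as partitioned Parquet (placeholder bucket)
curated.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-curated-bucket/events/"
)

job.commit()
```

In practice, a job like this would also register the curated output in the Data Catalog (for example, via a crawler) so that Redshift, Tableau developers, and data scientists can query it downstream.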
What You’re Looking For
If you’re looking for an opportunity to work in a fast-growing market, surrounded by talented, motivated, and global colleagues who thrive on helping clients meet their most pressing business goals, we are the company for you. If you’re driven, passionate, and want to be a ‘key player’ in a company’s growth, we invite you to make a difference with a company that’s defined by its employees. We want you to be bold, take risks, and imagine a better way to work. If we just described you, we should talk!
About Us
OZ is a 25-year-old global technology consulting, services, and solutions leader specializing in creating business-focused solutions for our clients by leveraging disruptive digital technologies and innovation.
OZ is committed to creating a continuum between work and life by allowing most of our team members to work remotely. We offer competitive compensation and a comprehensive benefits package including, but not limited to, full health benefits, a 401(k), and unlimited PTO. You’ll enjoy our work style within an incredible culture. We’ll give you the tools you need to succeed so you can grow and develop with us and become part of a team that lives by its core values.
Basic Qualifications
- Bachelor’s and/or Master’s degree in Computer Science
- 2+ years of work experience with ETL, Data Modeling, and Data Architecture.
- Expert-level skills in writing and optimizing SQL.
- Solid Linux skills.
- Experience operating very large data warehouses or data lakes.
- Expertise in ETL optimization, designing, coding, and tuning big data processes using Apache Spark or similar technologies.
- Experience with building data pipelines and applications to stream and process datasets at low latencies.
- Efficiency in handling data: tracking data lineage, ensuring data quality, and improving the discoverability of data.
- Sound knowledge of distributed systems and data architecture (e.g., Lambda architecture): able to design and implement batch and stream data processing pipelines, and to optimize the distribution, partitioning, and MPP of high-level data structures.
Preferred Skills
- AWS Glue Jobs, Crawlers, and Data Catalogs
- AWS Lambda, AWS Redshift, and S3
- AWS CloudFormation
- SAS, R, Python (nice to have)
- GitHub
- Agile SDLC, JIRA, Confluence
- SQL, NoSQL
What We Do
We are a leading consulting company whose services and solutions leverage Intelligent Automation to accelerate processes and provide detailed business insights. With specialties in data analytics, artificial intelligence (AI), robotic process automation (RPA), and more, our experts can enhance technology infrastructures to provide accurate reports, inform decision making, and improve customer satisfaction.