Data Engineer PySpark

Noida, Gautam Buddha Nagar, Uttar Pradesh
3-5 Years Experience
Information Technology • Consulting
The Role
The Data Engineer will collaborate with Data Scientists to design and implement machine learning pipelines, using PySpark for data processing and AWS EMR and S3 for data storage and processing. Responsibilities include managing ETL workflows, optimizing pipelines, and configuring IAM policies to secure data access.

Company Description

About Sopra Steria
Sopra Steria, a major Tech player in Europe recognised for its consulting, digital services and software development, helps its clients drive their digital transformation and obtain tangible and sustainable benefits. It provides end-to-end solutions to make large companies and organisations more competitive by combining in-depth knowledge of a wide range of business sectors and innovative technologies with a fully collaborative approach. Sopra Steria places people at the heart of everything it does and is committed to putting digital to work for its clients in order to build a positive future for all. With 50,000 employees in nearly 30 countries, the Group generated revenue of €5.1 billion in 2022.
The world is how we shape it.

Job Description

We are seeking a highly skilled and motivated Data Engineer to join our dynamic team. As a Data Engineer, you will collaborate closely with our Data Scientists to develop and deploy machine learning models. Proficiency in the skills listed below will be crucial in building and maintaining pipelines for training and inference datasets.

Responsibilities: 

• Work in tandem with Data Scientists to design, develop, and implement machine learning pipelines. 

• Utilize PySpark for data processing, transformation, and preparation for model training. 

• Leverage AWS EMR and S3 for scalable and efficient data storage and processing. 

• Implement and manage ETL workflows using StreamSets for data ingestion and transformation.

• Design and construct pipelines to deliver high-quality training and inference datasets. 

• Collaborate with cross-functional teams to ensure smooth deployment and real-time/near real-time inferencing capabilities. 

• Optimize and fine-tune pipelines for performance, scalability, and reliability. 

• Ensure IAM policies and permissions are appropriately configured for secure data access and management. 

• Implement Spark architecture and optimize Spark jobs for scalable data processing. 


Total Experience Expected: 4-6 years

Qualifications

BE (Bachelor of Engineering)

Additional Information

Requirements: 

Mandatory

• Proficiency in advanced SQL (window functions), Spark architecture, PySpark or Scala with Spark, and Hadoop.

• Proven expertise in designing and deploying data pipelines.

• Strong problem-solving skills and ability to work effectively in a collaborative team environment. 

• Excellent communication skills and ability to translate technical concepts for non-technical stakeholders.

Desirable

• Hands-on experience with Airflow, S3, and StreamSets or similar ETL tools (can be trained locally).

• Understanding of real-time or near real-time inferencing architectures. 

• Basic knowledge of Kafka, AWS IAM, AWS EMR, and Snowflake.

At our organization, we are committed to fighting against all forms of discrimination. We foster a work environment that is inclusive and respectful of all differences.

All of our positions are open to people with disabilities.

Top Skills

PySpark
Scala
SQL
The Company
HQ: Paris
49,329 Employees
On-site Workplace

What We Do

Sopra Steria, a major Tech player in Europe with 56,000 employees in nearly 30 countries, is recognised for its consulting, digital services and software development. It helps its clients drive their digital transformation and obtain tangible and sustainable benefits. The Group provides end-to-end solutions to make large companies and organisations more competitive by combining in-depth knowledge of a wide range of business sectors and innovative technologies with a fully collaborative approach. Sopra Steria places people at the heart of everything it does and is committed to putting digital to work for its clients in order to build a positive future for all. In 2023, the Group generated revenues of €5.8 billion.

The world is how we shape it
