We're looking for a talented Data Engineer to join our team and build innovative AI solutions. You'll be responsible for designing, developing, and maintaining our data architecture and data pipelines, specifically tailored for AI applications. The Data Engineer will collaborate closely with our Business Analysts, Architects, and Data Scientists on data initiatives, ensuring an optimal and consistent data delivery architecture across ongoing projects. The ideal candidate is self-directed, adept at supporting the data needs of multiple business units, systems, and products, and has excellent communication skills to capture requirements and understand data context for robust pipeline development.
What you’ll do
- Create and maintain data pipelines on Google Cloud Platform using Dataflow, Pub/Sub, BigQuery, and Cloud Storage (or their equivalents on other platforms such as AWS, Azure, or Hadoop)
- Work with data scientists to build and optimize our AI solutions for greater functionality in our data systems
- Build analytics tools that utilize the data pipeline to provide actionable insights into operational efficiencies
- Identify, design and implement process improvements: optimize data delivery and automate manual processes
- Work with stakeholders, including Product Owners and the Technology, Data, and Architecture teams, to assist with data-related technical issues and support their data needs
- Maintain data integrity and regionalization by defining boundaries across multiple GCP zones
- Design and develop visualizations in Tableau to perform statistical analysis of data
What experience you need
- Bachelor’s Degree in Computer Science, Statistics, Mathematics or another quantitative field
- At least 4 years of work experience
- Experience with big data cloud tools: Pub/Sub, Dataflow, BigQuery (Hadoop, Spark, Kafka, etc.)
- At least 2 years of experience with relational SQL and NoSQL databases
- At least 2 years of experience with object-oriented/functional scripting languages: Python, Java, Scala, etc.
- At least 2 years of experience working with data integration tools such as Informatica, Pentaho, Talend, DataStage
What could set you apart
- GCP, AWS or Azure cloud certifications
- Experience working in an agile development environment
- Experience specifically supporting Machine Learning or AI projects, including data preprocessing, feature engineering, and building pipelines for model training/serving datasets
- Familiarity with containerization technologies like Docker and orchestration systems like Kubernetes
Primary Location: CRI-Sabana
Function: Data and Analytics
Schedule: Full time
What We Do
At Equifax (NYSE: EFX), we believe knowledge drives progress. As a global data, analytics, and technology company, we play an essential role in the global economy by helping financial institutions, companies, employers, and government agencies make critical decisions with greater confidence. Our unique blend of differentiated data, analytics, and cloud technology drives insights to power decisions to move people forward.
Headquartered in Atlanta and supported by nearly 15,000 employees worldwide, Equifax operates or has investments in 24 countries in North America, Central and South America, Europe, and the Asia Pacific region.
For more information, visit Equifax.com.