Short Summary:
We are building a Machine Learning Platform that provides MLOps capabilities to help data scientists and ML engineers at Target implement ML solutions at scale. The work encompasses building the feature store and model ops, and supporting experimentation, iteration, monitoring, explainability, and continuous improvement across the machine learning lifecycle. You will be part of a team building scalable applications by leveraging the latest technologies. Connect with us if you want to join us on this exciting journey.
Roles and responsibilities:
- Build and maintain machine learning infrastructure that is scalable, reliable, and efficient.
- Be familiar with Google Cloud infrastructure and MLOps.
- Write highly scalable APIs. Deploy and maintain machine learning models, pipelines, and workflows in a production environment.
- Collaborate with data scientists and software engineers to design and implement machine learning workflows.
- Implement monitoring and logging tools to ensure that machine learning models are performing optimally.
- Continuously improve the performance, scalability, and reliability of machine learning systems.
- Work with teams to deploy and manage infrastructure for machine learning services.
- Create and maintain technical documentation for machine learning infrastructure and workflows.
- Stay up to date with the latest developments in technologies.
Tech stack: GCP cloud skills, GCP Machine Learning Engineer skills, GCP Vertex AI skills, Python, microservices, API development, Cassandra, Elasticsearch, Postgres, Kafka, Docker, CI/CD; optional: Java with Spring Boot
Required Skills:
- Bachelor's or Master's degree in computer science, engineering, or a related field.
- 9+ years of experience in software development, machine learning engineering.
- Deep understanding of machine learning (ML) principles, cloud infrastructure, and MLOps, as expected of a Lead Machine Learning Engineer specializing in Google Cloud (GCP).
- Hands-on experience using Vertex AI to manage an ML platform for feature engineering, model training, and model deployment.
- Vertex AI skills needed: BigQuery ML; automating ML workflows using Kubeflow Pipelines (KFP) or Cloud Composer; AI APIs; endpoints for real-time inference; Model Monitoring; Cloud Logging and Monitoring; Cloud Dataflow for stream processing; Cloud Dataproc (Spark and Hadoop) for distributed ML workloads.
- Deep experience with Python, API development, and microservices, including creating ML-powered REST APIs using FastAPI, Flask, or Cloud Functions.
- Java (optional, but useful for production ML systems).
- Expert in building high-performance APIs.
- Experience with DevOps practices, containerization, and tools such as Kubernetes, Docker, Jenkins, and Git.
- Good understanding of machine learning concepts and frameworks, deep learning, LLMs, etc.
- Good to have: experience deploying machine learning models in a production environment.
- Good to have: experience with data streaming technologies such as Kafka, Dataflow, Kinesis, and Pub/Sub.
- Strong analytical and problem-solving skills.
- Good to have: GCP certification (Professional Machine Learning Engineer).
What We Do
Target is an American retail company offering a wide selection of products such as furniture, electronics, toys, and more.
Target is one of the world’s most recognized brands and one of America’s leading retailers. We make Target our guests’ preferred shopping destination by offering outstanding value, inspiration, innovation and an exceptional guest experience that no other retailer can deliver. Target is committed to responsible corporate citizenship, ethical business practices, environmental stewardship and generous community support. Since 1946, we have given 5 percent of our profits back to our communities. Our goal is to work as one team to fulfill our unique brand promise to our guests, wherever and whenever they choose to shop.