We are a dedicated team developing Large Language Models, one of the most prestigious and advanced ongoing Natural Language Processing projects in the world. Our team is responsible for the entire data engineering pipeline—from the collection of raw text data, through preprocessing and storage, to serving the data for model training and deployment. Additionally, our Data Engineering team is actively involved in Optical Character Recognition and image processing.
Our focus is on the rapid development and deployment of state-of-the-art information retrieval systems to meet complex information needs. As a Data Engineer, you will play a critical role in our team, owning the core data engineering tasks in our product pipeline. You will collaborate closely with cross-functional teams to provide innovative solutions to real-world problems.
To succeed in this role, you'll need a results-driven mindset, a passion for excellence, and a continuous desire to learn and improve. Your key responsibilities in this project will include:
- Using programming languages such as Python, R, and Scala to analyze data and build statistical models.
- Providing insights, metrics, and explanations for data variance through your technical expertise.
- Building knowledge graphs and services to support the information retrieval process.
- Implementing best-practice data quality assurance mechanisms.
Qualifications for this role include:
- Bachelor’s degree in Computer Engineering, Software Engineering, or equivalent field.
- 3+ years of experience with data cleaning, preprocessing, and data architecture, especially with big data
- 3+ years of coding experience in at least one modern programming language (Python is preferred; R, Ruby, Scala, Java, etc. are also acceptable)
- Extensive knowledge and practical experience in several of the following areas: machine learning, statistics, deep learning, recommendation systems, information retrieval, data preparation, and web crawling
- Basic NLP skills (e.g., word embeddings, language models) to facilitate communication between end users and data. Knowledge of Large Language Models and their data preparation steps is a plus
- Basic knowledge of NoSQL databases, with a preference for Elasticsearch and MongoDB. Experience with RDBMS such as PostgreSQL or MySQL is also valuable
- Experience with data visualization tools. Experience with Grafana and Airflow is a strong plus
- Basic knowledge of Apache Spark and Hadoop is a big advantage
- Proficiency in Linux-based OS operations
- A solid understanding of search-related business scenarios and core technologies
- A passion for sharing knowledge and the confidence to seek help when needed
- Fluency in both written and spoken English
- Experience mentoring or leading teams of 5+ members, and providing technical or professional guidance, is a plus
- An eagerness to learn new technologies is highly valued
What We Do
Huawei is a leading global provider of information and communications technology (ICT) infrastructure and smart devices. With integrated solutions across four key domains – telecom networks, IT, smart devices, and cloud services – we are committed to bringing digital to every person, home and organization for a fully connected, intelligent world.
Huawei's end-to-end portfolio of products, solutions and services is both competitive and secure. Through open collaboration with ecosystem partners, we create lasting value for our customers, working to empower people, enrich home life, and inspire innovation in organizations of all shapes and sizes.
At Huawei, innovation focuses on customer needs. We invest heavily in basic research, concentrating on technological breakthroughs that drive the world forward. We have more than 180,000 employees, and we operate in more than 170 countries and regions. Founded in 1987, Huawei is a private company fully owned by its employees.
House Rules
This page is for ICT professionals with an interest in Huawei and our industry to engage in open discussions.
To facilitate dialogue, please follow these rules:
- Huawei reserves the right to delete comments that are offensive, misleading, false, unlawful, off-topic, or in violation of any regulations.
- Users who repeatedly violate any of the above rules may have their comments removed and may be blocked.
- Huawei does not necessarily endorse the information shared by members.
- Please be familiar with and follow LinkedIn's User Agreement.
- By publicly uploading a photograph or comment, you give Huawei permission to feature your content. This will always be credited.
Please visit the portals below for career or customer service queries.
Career page: http://bit.ly/2rdljD7
Customer service: http://bit.ly/2a4mXNY
Thank you for visiting us & we hope you enjoy your time on our page.