Senior Data Engineer

Workstream • Shanghai, Shanghai, People's Republic of China • 1w ago

We are in search of a seasoned Data Engineer who has a strong background in Machine Learning Engineering. In this unique role, you will primarily focus on data engineering tasks while leveraging your ML expertise to collaborate with data scientists for model deployment.

The ideal candidate excels at building robust data infrastructure and has a proven track record of helping data teams deploy machine learning models in real-world environments. You'll work hand in hand with our platform team to support and maintain our data infrastructure while assisting data scientists with model deployment and monitoring.

Day in the Life

Design, build, and maintain efficient, reliable, and complex ETL pipelines to process and analyze large volumes of data from various sources.
Develop and enhance our data lakehouse, driving data quality across departments and building self-service tools for analysts.
Define, build, and own the data architecture for a trusted, governed, dimensionally modeled data repository.
Collaborate with cross-functional teams including data scientists to assist in deploying and monitoring machine learning models in production environments.
Help data scientists develop and maintain ML API services for seamless integration into the company's infrastructure.
Apply knowledge of real-time, streaming, and batch processing concepts to optimize model performance and data handling.
Participate in code and design reviews to maintain high development standards.

Who You Are

Bachelor's/Master's degree in Computer Science, Data Science, or a related quantitative field.
Proficiency in Python and software engineering.
Proven experience as a Data Engineer, with a solid understanding of SQL and Big Data technologies.
Expertise in containerization and orchestration technologies like Docker and Kubernetes.
Knowledge of vector stores, databases, and data warehousing concepts.
Experience deploying and monitoring ML API services using Flask or FastAPI.
Strong project management skills, with the ability to collaborate effectively with cross-functional teams.

Preferred Qualifications

Experience with Hevo Data or other streaming vendors (Fivetran, Airbyte, DMS)
Experience with dbt
Experience with Snowflake or Redshift
Experience with orchestration tools such as Airflow
Experience with data catalog solutions such as Atlan
Experience with Metaflow
Experience with cloud platforms such as AWS, GCP, or Azure
Experience with specialized ML serving tools such as BentoML, Seldon Core, Hugging Face Inference, or SageMaker Endpoints

What We Offer

A mission-driven and values-based company dedicated to empowering deskless workers and local businesses
An early employee opportunity at a Series B hyper-growth startup
Work with the founding team and industry veterans to accelerate your career
Competitive salary and equity
Comprehensive health coverage
Performance-based year-end bonuses
Unlimited PTO
Remote/WFH schedule