Machine Learning Engineer - Web Data Quality

Zyte

Rio de Janeiro

Remote

BRL 160,000 - 200,000

Full-time

Posted 30+ days ago

Job summary

A data technology company based in Brazil is seeking a Machine Learning Engineer to design and implement systems for improving web data quality. You will develop ML models and collaborate closely with teams to enhance data accuracy at scale. Ideal candidates have 3+ years in AI Engineering and strong Python skills. This role offers a remote-first work environment with generous benefits including 35 days of paid time off.

Benefits

35 days of paid time off
Health & wellness support
Inclusive and supportive team environment
Attend conferences
Work with cutting-edge technologies

Qualifications

  • 3+ years of experience in Machine Learning, Data Science, or AI Engineering.
  • Strong Python skills and experience with ML frameworks.
  • Understanding of model evaluation, metrics, and deployment best practices.

Responsibilities

  • Develop and deploy ML models for anomaly detection and content validation.
  • Build data quality pipelines leveraging modern tools.
  • Collaborate with engineers to integrate AI systems into production workflows.

Skills

Machine Learning / Data Science / AI Engineering
Python
Data validation
Anomaly detection
Collaboration

Tools

PyTorch
TensorFlow
scikit-learn
Airflow
Spark

Job description

At Zyte, we make the world’s web data accessible to everyone. Our technology powers data extraction at scale, helping businesses and researchers unlock the full potential of the web.

We’re a remote-first, multicultural team of engineers, data scientists, and innovators who believe in curiosity, collaboration, and continuous learning. If you’re passionate about building reliable AI systems and improving the quality of web data, we’d love to hear from you.

About the Role

As a Machine Learning Engineer (Web Data Quality), you’ll design and implement intelligent systems that automatically detect, measure, and improve the quality of large-scale web datasets. You’ll work at the intersection of data science, AI, and distributed systems, collaborating closely with product, engineering, and data teams to make data accuracy measurable, scalable, and actionable.

What You’ll Do
  • Develop and deploy ML models for anomaly detection, schema drift, and content validation
  • Build and improve data quality pipelines leveraging modern data and MLOps tools
  • Design and optimize embeddings and GenAI models to enhance data consistency
  • Collaborate with engineers to integrate AI systems into production workflows
  • Conduct experiments, evaluate performance, and iterate for continuous improvement
  • Stay up to date on AI/ML and GenAI research to guide innovation within Zyte
Required
  • 3+ years of experience in Machine Learning / Data Science / AI Engineering
  • Strong Python skills and experience with ML frameworks (PyTorch, TensorFlow, scikit-learn)
  • Experience with data validation, anomaly detection, or data quality systems
  • Familiarity with data pipelines (Airflow, Spark, or similar)
  • Understanding of model evaluation, metrics, and deployment best practices
  • Excellent problem-solving, communication, and collaboration skills
Preferred
  • Experience with LangChain, LlamaIndex, or GenAI model orchestration
  • Familiarity with data labeling tools and active learning approaches
  • Contributions to open-source or public ML projects
  • Experience working in a remote, cross-functional team environment
Benefits
  • 35 days of paid time off
  • Health & wellness support
  • Inclusive and supportive team environment
  • Attend conferences and meet with team members from across the globe
  • Work with cutting-edge open source technologies and tools