A transformative tech firm is seeking a Senior Data Scientist to join its initiative focused on AI and data modernization. The role involves developing customer clustering models and applying advanced analytics techniques. Candidates should have more than 4 years of experience, proficiency in Python, and familiarity with AWS tools. The position is remote from Spain under a Spanish contract.
Senior Data Scientist
Location: Remote from Spain (Spanish contract)
Join a transformative data and AI platform initiative aimed at modernizing enterprise-scale capabilities and enabling real-time decision-making. This project delivers a comprehensive roadmap covering AI, MLOps, data governance, and platform scalability, supporting a shift towards data-first operations and intelligent automation.
Requirements:
- 4+ years of experience as a Data Scientist, with deep expertise in unsupervised learning, clustering, and advanced exploratory data analysis.
- Strong hands-on experience with SHAP or similar model interpretability techniques.
- Proficiency in Python, Pandas, SQL, Jupyter, and common data manipulation and visualization tools.
- Familiarity with AWS ecosystem tools like S3, RDS, IAM, and BI solutions such as QuickSight.
- Experience designing and building GenAI or LLM-based workflows, including prompt engineering and integrating APIs.
- Ability to benchmark different LLM solutions and assess their performance for specific summarization and recommendation use cases.
- Skilled in transforming raw outputs into compelling, business-relevant insights for both technical and non-technical audiences.
Nice to have:
- Experience implementing RAG pipelines with vector databases and domain document ingestion.
- Exposure to MLOps workflows and tooling (e.g. MLflow, SageMaker, Airflow, Terraform).
- Prior work on integrating BI platforms with AI/ML pipelines.
- Background in identity verification.
Responsibilities:
- Drive the development and evolution of customer clustering models using unsupervised learning to identify patterns in pass rate performance and flag inconsistencies.
- Lead SHAP-based explainability initiatives to uncover the root causes behind verification failures and create dynamic, on-demand explanations.
- Conduct benchmarking of LLM APIs, assessing summarization quality, latency, relevance, and cost to inform GenAI solution design.
- Collaborate on pipeline development to extract, preprocess, and format QuickSight reports for GenAI consumption.
- Build and test proof-of-concept RAG pipelines that enhance LLMs with domain-specific context from historical documents and verification reports.
- Work closely with Delivery Managers to translate complex analytics and model outputs into business-friendly visualizations and narratives.
- Continuously refine clustering methodology by evaluating alternative models, tuning hyperparameters, and expanding criteria.
- Partner with MLOps engineers to ensure seamless integration of data science pipelines into the broader infrastructure, with a focus on automation and scalability.