Data Scientist
CA Remote/Vancouver
Overview
The Data Scientist plays a key role in developing innovative AI solutions. You will be responsible for developing scalable, modular machine learning solutions embedded in clinical research software. Working with moderate independence, you will build validated pipelines, implement algorithm enhancements, and integrate user‑focused feedback loops. You will work closely with cross‑functional teams, including AI data engineers, AI engineers, bioinformatics analysts, software engineers, and product managers, to bring to life AI‑powered solutions that accelerate clinical trials and generate insights.
Responsibilities
- Design and build state‑of‑the‑art AI models for diverse text and image processing use cases, with scalability and user impact in mind.
- Own development and validation of specific features or model components.
- Implement basic models with a focus on reproducibility and modularity.
- Evaluate models with rigorous performance metrics and real‑world data.
- Select and implement appropriate AI technologies based on the problem and data characteristics (e.g., LLMs or other NLP techniques for text processing, classification, or generation; CNNs for image processing or predictive modeling).
- Support data processing and automation of model development workflows using R, Python, SQL, and command line tools.
- Explore data to identify patterns, trends, and anomalies important for model development using R and Python.
- Document assumptions and results clearly.
- Design feedback mechanisms into models to track drift, accuracy, and user outcomes.
- Collaborate with AI data engineers and AI engineers to operationalize models into production systems for real‑time prediction and decision‑making.
- Assist with extract, transform, and load (ETL) processes of datasets stored on services such as S3, DynamoDB, or Redshift.
- Communicate the insights and implications of your AI models to stakeholders clearly and concisely, bridging the gap between technical teams and clinicians.
- Stay up to date on the latest advancements in AI research and translate them into practical applications for the company.
- Actively collaborate across engineering, product, and domain teams.
- Participate in team reviews and identify user needs.
- Other responsibilities as assigned.
Qualifications
- Strong focus on using AI technologies with text and/or image data.
- In‑depth understanding of large language models, convolutional neural networks, or other relevant AI techniques.
- Strong understanding of statistical and machine learning methods.
- Expertise in R and/or Python, and familiarity with AI libraries such as TensorFlow and PyTorch.
- Experience using version control software (Git, SVN, or similar) to manage programming code.
- Excellent communication and collaboration skills.
- Ability to manage multiple tasks.
- Ability to work independently, as well as in a team environment.
- Ability to effectively communicate technical concepts, in both written and oral format.
Required Education and Experience
- Bachelor’s degree in computer science, data science, or a related field with 3+ years of experience as a Data Scientist or in a similar role, or an equivalent combination of education and experience, such as:
  - Master’s degree in computer science, data science, or a related field with 2 years of experience.
  - PhD in a related field with no prior work experience.
- Experience designing and building production‑grade AI models using AI/ML services such as SageMaker and Bedrock (preferred).
- Experience in templating reproducible analytical workflows using R Markdown and/or Jupyter Notebook.
EEO Statement
The Emmes Company, LLC is an equal opportunity affirmative action employer and does not discriminate in its selection and employment practices. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, political affiliation, sexual orientation, gender identity, marital status, disability, protected veteran status, genetic information, age, or other legally protected characteristics.