AI Machine Learning and Data Augmentation, Senior Scientist
SystImmune Inc. is a clinical-stage biopharmaceutical company dedicated to developing innovative treatments for cancer through breakthrough therapeutic multi-specific antibodies and antibody-drug conjugates (ADCs). We are seeking a talented Senior Scientist/Programmer for our AI Drug Design (AIDD) team to lead the extension of our in-house large language model (LLM) implementation, focusing on sequence-to-structure-to-activity relationship modeling for antibody discovery, protein engineering, and immuno-oncology applications.
Job Summary:
We are looking for an exceptional Senior Scientist/Programmer to drive innovation and optimization in our AI capabilities, leveraging our internal data and expertise in antibody discovery, protein engineering, and immuno-oncology. The successful candidate will design, develop, and implement AI models, data pipelines, and parallel computing architectures to accelerate the discovery and development of novel therapeutics using our in-house LLM implementation.
Key Responsibilities:
- Llama 3.3 Implementation and Extension:
  - Develop and fine-tune Llama 3.3 models for sequence-to-structure-to-activity relationship prediction.
  - Integrate domain-specific knowledge and constraints into the Llama 3.3 framework.
- Data Generation and Processing:
  - Design and implement data generation pipelines for high-quality training datasets.
  - Develop and optimize algorithms for data processing and feature extraction.
- Fine-Tuning of AI Model:
  - Fine-tune the Llama 3.3 model using processed SystImmune R&D data.
  - Integrate additional features and constraints from SystImmune's internal data.
- Embedding Language Models:
  - Use embedding language models to convert data into numerical representations.
  - Use retrieval-augmented generation (RAG) with MariaDB Vector DB to enhance data retrieval capabilities.
- Automatic Data Flow Management:
  - Design and implement automatic data flow management from the current LIMS to the AI Embedding DB.
  - Develop data pipelines to extract, transform, and load data.
- Parallel Computing and Optimization:
  - Implement parallel computing architectures to accelerate model training and inference.
  - Optimize code performance on various computing platforms.
- Software Development and Integration:
  - Design and maintain software applications for model training and data processing.
  - Collaborate with the AIDD team for seamless integration of workflows.
- Data Security and Backup Management:
  - Develop and implement robust data security measures.
  - Design and manage backup strategies for external data.
- Data Loss Prevention and Incident Response:
  - Develop procedures for preventing data loss and responding to incidents.
  - Establish a disaster recovery plan.
Requirements:
- Education: Ph.D. or Master's degree in Computer Science, Artificial Intelligence, Bioinformatics, or a related field.
- Experience: 3+ years in AI/ML model development.
- Technical Skills:
  - Proficiency in Python, C++, or Julia.
  - Experience with deep learning frameworks.
  - Familiarity with parallel computing architectures.
- Domain Knowledge: Strong understanding of antibody discovery and related fields.
- Communication Skills: Excellent communication and collaboration skills.
Nice to Have:
- Experience with Llama 3.3 or other large language models.
- Experience with RAG and MariaDB Vector DB.
- Knowledge of process development principles.
Compensation and Benefits:
The hiring pay range for this position is $150,000 to $250,000 per year, based on relevant skills and experience.
SystImmune is a leading clinical-stage biopharmaceutical company located in Redmond, WA and Princeton, NJ. We offer an opportunity for you to learn and grow while making significant contributions to the company’s success. SystImmune offers a comprehensive benefits package.
SystImmune is an Equal Opportunity Employer. We welcome diverse talent and encourage all qualified applicants to apply.