AI Research Scientist (Test Time Compute) | naptha.ai
About The Role
We are seeking an exceptional AI Research Scientist to join Naptha AI at the ground floor, focusing on advancing the state of the art in test-time compute optimization for large language models. In this role, you will research and develop novel approaches to improve inference efficiency, reduce computational requirements, and enhance model performance at deployment. You will work directly with our technical team to shape the architecture of our inference optimization platform.
This position addresses core technical challenges related to model compression, efficient inference strategies, and deployment optimization. You will operate at the intersection of machine learning, systems optimization, and hardware acceleration to develop practical solutions for real-world model deployment and scaling.
Core Responsibilities
Research & Development
- Design and implement novel architectures for efficient model inference
- Develop frameworks for model compression and quantization
- Research approaches to optimizing test-time computation across hardware
- Create protocols for distributed inference and resource management
- Implement and test new ideas through rapid prototyping
Technical Innovation
- Stay current with developments in ML efficiency and inference optimization
- Identify and solve technical challenges in model deployment
- Develop novel approaches to model compression and acceleration
- Bridge theoretical research with practical implementation
- Contribute to the academic community through publications and open source
Platform Development
- Design and implement efficient inference pipelines
- Develop scalable solutions for model deployment and serving
- Create tools for performance monitoring and optimization
- Collaborate with the engineering team on implementation
- Build proofs of concept for new optimization techniques
Leadership & Collaboration
- Work with the engineering team to implement research findings
- Mentor team members on optimization techniques
- Contribute to technical strategy and roadmap
- Collaborate with external research partners
- Evaluate and integrate external research developments
Candidate Profile
Ideal candidates will have:
- Strong background in machine learning and systems optimization
- Deep understanding of model compression and inference techniques
- Experience with ML frameworks and deployment tools
- Experience with ML infrastructure and hardware acceleration
- Proven track record of implementing efficient ML systems
- Excellent programming skills (Python required; C++/CUDA a plus)
- Strong analytical and problem-solving skills
- A PhD in a relevant field, or equivalent experience, is a plus
- Published research is a plus
Technical Experience Needed
- Python programming and ML frameworks (PyTorch, TensorFlow)
- Model optimization techniques (quantization, pruning, distillation)
- MLOps and deployment
- Hardware acceleration (GPU, TPU)
- Version control and collaborative development
- Experience with large language models
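To give a flavor of the model optimization work this role involves, here is a minimal sketch of symmetric int8 quantization, one of the compression techniques listed above. The function names and the plain-Python setting are illustrative only; in practice this would be applied to framework tensors (e.g. via PyTorch's quantization tooling) rather than Python lists.

```python
def quantize_int8(values):
    """Symmetric int8 quantization: map floats to integers in [-127, 127].

    The scale is chosen so the largest-magnitude value maps to +/-127.
    """
    # Guard against an all-zero input, where the scale would be zero.
    scale = max(abs(v) for v in values) / 127 or 1.0
    quantized = [round(v / scale) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from int8 codes and the scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25]
q, scale = quantize_int8(weights)
recovered = dequantize_int8(q, scale)
# Each recovered value differs from the original by at most one
# quantization step (the scale), which is the rounding error bound.
```

The same idea (store low-precision integer codes plus a per-tensor or per-channel scale) underlies most practical quantization schemes; production systems add calibration, per-channel scales, and quantization-aware training on top of it.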
Hiring Process
- Initial technical interview
- Research presentation
- System design discussion
- Technical challenge
- Team collaboration interview
Compensation & Benefits
- Competitive salary with equity
- Remote-first environment
- Medical, dental, vision coverage
- Flexible PTO
- Learning and development budget
- Conference and publication support
- Home office setup allowance
Additional Notes
- Comfort with ambiguity and rapid iteration
- Practical approach to research ideas
- Passion for advancing efficient ML systems
- Interest in open source community
Naptha AI is committed to diversity and inclusion. We are an equal opportunity employer and welcome applications from all qualified candidates.