LLM Architect

AllCloud

United States

On-site

USD 75,000 - 169,000

Full time

Yesterday

Job summary

AllCloud, a leading cloud enablement company, seeks an innovative LLM Architect to design and develop custom language models. The ideal candidate will possess deep expertise in NLP and transformer model design, working alongside engineers to create state-of-the-art language models that meet specific customer requirements.

Benefits

Medical insurance
Vision insurance

Qualifications

  • 4+ years of experience in deep learning research or development focused on NLP.
  • Strong understanding of transformer architecture and its variants.
  • Expertise in designing and training large language models from scratch.

Responsibilities

  • Design custom transformer-based language model architectures tailored to specific use cases.
  • Implement techniques to optimize model size and performance.
  • Collaborate with GPU Engineers to implement efficient training strategies.

Skills

Deep Learning
Natural Language Processing (NLP)
Transformer Models
Model Compression Techniques
Mathematics

Education

Master's or PhD in Computer Science

Tools

PyTorch
TensorFlow

Job description

Job Type: Full-time, Permanent

About AllCloud

AllCloud is a global professional services company providing organizations with cloud enablement and transformation tools. As an AWS Premier Consulting Partner and audited MSP, a Salesforce Platinum Partner, and a Snowflake Premier Partner, AllCloud helps clients connect their front and back offices by building a new operating model to harness the benefits of cloud technology, data, and analytics.

Job Summary

We are looking for an innovative LLM Architect to lead the design and development of custom language models at AllCloud. This role will be responsible for architecting, training, and optimizing large language models based on modified transformer architectures. The ideal candidate will have deep expertise in NLP, transformer model design, and efficient training methodologies. You'll work alongside GPU Engineers and ML Engineers to create state-of-the-art language models that meet our customers' specific requirements, pushing the boundaries of what's possible with generative AI.

Responsibilities

  • Design custom transformer-based language model architectures tailored to specific use cases
  • Develop and implement modifications to transformer architectures to enhance performance, efficiency, or capabilities
  • Create and execute model pre-training, fine-tuning, and evaluation strategies
  • Implement techniques like quantization, pruning, and knowledge distillation to optimize model size and performance
  • Design and implement training data pipelines, including data selection, cleaning, and augmentation
  • Establish rigorous evaluation frameworks to assess model performance, fairness, and safety
  • Research and implement state-of-the-art techniques in LLM development
  • Create detailed documentation on model architectures, training methodologies, and performance characteristics
  • Collaborate with GPU Engineers to implement efficient training strategies across distributed systems
  • Work with customers to understand their unique requirements and translate them into model design decisions

Summary of Key Requirements

  • 4+ years of experience in deep learning research or development with a focus on NLP and transformer models
  • Strong understanding of transformer architecture and its variants (GPT, BERT, T5, etc.)
  • Experience designing and training large language models from scratch
  • Expertise in PyTorch or TensorFlow for implementing custom model architectures
  • Knowledge of distributed training approaches for large models (DeepSpeed, Megatron, etc.)
  • Experience with model compression techniques (quantization, pruning, knowledge distillation)
  • Strong background in mathematics, particularly linear algebra, differential equations, probability, and statistics
  • Familiarity with current research in LLM development, including attention mechanisms, mixture of experts, and efficient training methods
  • Master's or PhD in Computer Science, Machine Learning, or related field
  • Publication record in NLP, LLMs, or transformer architecture (strongly preferred)

Certifications

  • AWS Machine Learning Specialty (strongly preferred)

Why work for us?

Our team inspires progress in each other and in our customers through our relentless pursuit of excellence; you will work with leaders who promote learning and personal development.

AllCloud is an Equal Opportunity Employer and considers applicants for employment without regard to race, color, religion, sex, orientation, national origin, age, disability, genetics or any other basis forbidden under federal, provincial, or local law.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Consulting
Industries: IT Services and IT Consulting
