AllCloud

LLM Architect

Posted 11 Days Ago
Remote
Hiring Remotely in United States
Senior level
The LLM Architect will design and develop custom language models, optimizing transformer architectures, implementing training methodologies, and collaborating with engineers to create advanced AI models.
The summary above was generated by AI
Description

LLM Architect

Location: US / Canada (Eastern Time), home-based

Job Type: Full-time, Permanent 

About AllCloud

AllCloud is a global professional services company providing organizations with cloud enablement and transformation tools. As an AWS Premier Consulting Partner and audited MSP, a Salesforce Platinum Partner, and a Snowflake Premier Partner, AllCloud helps clients connect their front and back offices by building a new operating model to harness the benefits of cloud technology and data and analytics.

Job Summary

We are looking for an innovative LLM Architect to lead the design and development of custom language models at AllCloud. This role will be responsible for architecting, training, and optimizing large language models based on modified transformer architectures. The ideal candidate will have deep expertise in NLP, transformer model design, and efficient training methodologies. You'll work alongside GPU Engineers and ML Engineers to create state-of-the-art language models that meet our customers' specific requirements, pushing the boundaries of what's possible with generative AI.

Responsibilities

  • Design custom transformer-based language model architectures tailored to specific use cases
  • Develop and implement modifications to transformer architectures to enhance performance, efficiency, or capabilities
  • Create and execute model pre-training, fine-tuning, and evaluation strategies
  • Implement techniques like quantization, pruning, and knowledge distillation to optimize model size and performance
  • Design and implement training data pipelines, including data selection, cleaning, and augmentation
  • Establish rigorous evaluation frameworks to assess model performance, fairness, and safety
  • Research and implement state-of-the-art techniques in LLM development
  • Create detailed documentation on model architectures, training methodologies, and performance characteristics
  • Collaborate with GPU Engineers to implement efficient training strategies across distributed systems
  • Work with customers to understand their unique requirements and translate them into model design decisions

Requirements

Summary of Key Requirements

  • 4+ years of experience in deep learning research or development with a focus on NLP and transformer models
  • Strong understanding of transformer architecture and its variants (GPT, BERT, T5, etc.)
  • Experience designing and training large language models from scratch
  • Expertise in PyTorch or TensorFlow for implementing custom model architectures
  • Knowledge of distributed training approaches for large models (DeepSpeed, Megatron, etc.)
  • Experience with model compression techniques (quantization, pruning, knowledge distillation)
  • Strong background in mathematics, particularly linear algebra, differential equations, probability, and statistics
  • Familiarity with current research in LLM development, including attention mechanisms, mixture of experts, and efficient training methods
  • Master's or PhD in Computer Science, Machine Learning, or related field
  • Publication record in NLP, LLMs, or transformer architecture (strongly preferred)

Certifications

  • AWS Machine Learning Specialty (Strongly Preferred)
  • NVIDIA-Certified Associate - Generative AI Multimodal (Preferred)

Why work for us? 

Our team inspires progress in each other and in our customers through our relentless pursuit of excellence; you will work with leaders who promote learning and personal development.


AllCloud is an Equal Opportunity Employer and considers applicants for employment without regard to race, color, religion, sex, orientation, national origin, age, disability, genetics or any other basis forbidden under federal, provincial, or local law.


Top Skills

DeepSpeed
Megatron
NLP
PyTorch
TensorFlow
Transformer Models


