Nvidia Model Parallelism: Building and Deploying Large Neural Networks (MPBDLNN)
Course Objectives
Very large deep neural networks (DNNs), whether applied to natural language processing (e.g., GPT-3), computer vision (e.g., huge Vision Transformers), or speech AI (e.g., wav2vec 2.0), have properties that set them apart from their smaller counterparts. As DNNs become larger and are trained on progressively larger datasets, they can adapt to new tasks with just a handful of training examples, accelerating the route toward general artificial intelligence. Training models that contain tens to hundreds of billions of parameters on vast datasets isn’t trivial and requires a unique combination of AI, high-performance computing (HPC), and systems knowledge.
Please note that once a booking has been confirmed, it is non-refundable. This means that after you have confirmed your seat for an event, it cannot be cancelled and no refund will be issued, regardless of attendance.
In this workshop, participants will learn how to:
- Train neural networks across multiple servers
- Use techniques such as activation checkpointing, gradient accumulation, and various forms of model parallelism to overcome the challenges associated with large-model memory footprint
- Capture and understand training performance characteristics to optimize model architecture
- Deploy very large multi-GPU models to production using NVIDIA Triton™ Inference Server
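Gradient accumulation, one of the memory-saving techniques listed above, can be illustrated with a minimal sketch in plain Python. It assumes a toy one-parameter model y = w * x with a mean squared error loss; the function names and data are hypothetical and are not taken from the workshop material. The point is only that averaging the gradients of equal-size micro-batches reproduces the full-batch gradient, so a large batch can be processed in smaller pieces that fit in memory.

```python
# Gradient accumulation sketch in plain Python (no framework required).
# Assumed toy model: y = w * x, loss = mean squared error over the batch.

def mse_grad(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def accumulated_grad(w, xs, ys, micro_batch_size):
    """Process the batch in equal-size micro-batches and average their gradients."""
    assert len(xs) % micro_batch_size == 0, "sketch assumes equal-size micro-batches"
    num_steps = len(xs) // micro_batch_size
    total = 0.0
    for i in range(num_steps):
        lo = i * micro_batch_size
        hi = lo + micro_batch_size
        # Scale each micro-batch gradient by 1/num_steps so the sum is a mean.
        total += mse_grad(w, xs[lo:hi], ys[lo:hi]) / num_steps
    return total

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5

full = mse_grad(w, xs, ys)                               # one pass over the whole batch
accum = accumulated_grad(w, xs, ys, micro_batch_size=2)  # two smaller passes
print(full, accum)
```

In a real framework the optimizer step would be taken only after the last micro-batch, which is what keeps the result equivalent to a single large-batch step.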
Target Audience
This course is designed for AI researchers, machine learning engineers, and deep learning practitioners who want to train and deploy large-scale models efficiently. It is ideal for professionals working in natural language processing (NLP), generative AI, and high-performance computing (HPC) who seek to optimize large model training and inference. The course is particularly relevant for those involved in developing transformer-based models like GPT and working with distributed training frameworks.
Prerequisites
Participants should have:
- A good understanding of PyTorch
- A good understanding of deep learning and data parallel training concepts
- Practical experience with deep learning and data parallel training is useful but optional
Learning Methodology
The course offers a balanced mix of theory and practice in a first-class learning environment. Benefit from direct exchange with our project-experienced trainers and other participants to maximize your learning success.
Course Content
Introduction to Training of Large Models
- Learn about the motivation behind and key challenges of training large models.
- Get an overview of the basic techniques and tools needed for large-scale training.
- Get an introduction to distributed training and the Slurm job scheduler.
- Train a GPT model using data parallelism.
- Profile the training process and understand execution performance.
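The idea behind data parallel training mentioned above can be sketched without any framework. The example below is a simplified simulation, not the workshop's code: each simulated worker holds a replica of a one-parameter model and its own data shard, and `all_reduce_mean` is a hypothetical stand-in for the collective operation a real distributed backend would provide.

```python
# Data parallelism sketch: identical replicas, different data shards,
# gradients averaged across workers before every update.

def local_grad(w, shard):
    """Gradient of the mean squared error of y = w * x on one worker's shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values):
    """Stand-in for a collective all-reduce: every worker receives the mean."""
    mean = sum(values) / len(values)
    return [mean] * len(values)

def data_parallel_step(replicas, shards, lr):
    """One synchronous training step across all simulated workers."""
    grads = [local_grad(w, shard) for w, shard in zip(replicas, shards)]
    synced = all_reduce_mean(grads)
    # Every replica applies the same averaged gradient, so replicas stay in sync.
    return [w - lr * g for w, g in zip(replicas, synced)]

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
shards = [data[:2], data[2:]]   # one equal-size shard per worker
replicas = [0.5, 0.5]           # identical initial weights on both workers

replicas = data_parallel_step(replicas, shards, lr=0.01)
print(replicas)
```

Because every worker still stores the full model, this scheme speeds up training but does not reduce the per-device memory needed for the parameters.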
Model Parallelism: Advanced Topics
- Increase the model size using a range of memory-saving techniques.
- Get an introduction to tensor and pipeline parallelism.
- Go beyond natural language processing and get an introduction to DeepSpeed.
- Auto-tune model performance.
- Learn about mixture-of-experts models.
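As a rough illustration of the tensor parallelism introduced in this module, the sketch below splits the weight matrix of a single linear layer column-wise across two simulated devices. It is written in plain Python with hypothetical helper names and tiny hand-picked matrices; `concat_columns` stands in for the gather step a real implementation would perform over the interconnect.

```python
# Tensor (column) parallelism sketch for one linear layer, y = x @ w.

def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n), both lists of rows."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def split_columns(w, parts):
    """Split weight matrix w into `parts` equal column blocks, one per device."""
    n = len(w[0])
    assert n % parts == 0, "sketch assumes the column count divides evenly"
    width = n // parts
    return [[row[d * width:(d + 1) * width] for row in w] for d in range(parts)]

def concat_columns(blocks):
    """Stand-in for an all-gather: join per-device outputs along columns."""
    return [sum((block[i] for block in blocks), []) for i in range(len(blocks[0]))]

x = [[1.0, 2.0], [3.0, 4.0]]                       # activations (batch x hidden)
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]   # weights (hidden x output)

weight_shards = split_columns(w, parts=2)          # each device stores half of w
partial_outputs = [matmul(x, shard) for shard in weight_shards]
y_parallel = concat_columns(partial_outputs)
y_reference = matmul(x, w)                         # single-device result
print(y_parallel == y_reference)
```

Each device holds only a fraction of the layer's weights, which is what allows a layer that is too large for one device to be distributed. Pipeline parallelism differs in that it places whole layers, rather than slices of one layer, on different devices.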
Inference of Large Models
- Understand the challenges of deployment associated with large models.
- Explore techniques for model reduction.
- Learn how to use TensorRT-LLM.
- Learn how to use Triton Inference Server.
- Understand the process of deploying a GPT checkpoint to production.
- See an example of prompt engineering.
Final Review
- Review key learnings and answer questions.
- Complete the assessment and earn a certificate.
- Complete the workshop survey.
Notes
Partner
We offer this course in cooperation with our NVIDIA Learning Partner, Fast Lane Institute for Knowledge Transfer GmbH.
Open Badge for this course: your digital proof of competence

By successfully completing a course at IT-Schulungen.com, you receive a digital Open Badge (certificate) in addition to your certificate of attendance, as a modern proof of the skills you have acquired.
Your Open Badge is available at any time in your personal, free Mein IT-Schulungen.com account. With a few clicks, you can share this digital credential on social networks to make your expertise visible and strengthen your professional profile.