NVIDIA DGX H100 User Guide
4.3. Obtaining an NGC Account
NVIDIA NGC provides access to GPU-optimized software for deep learning, machine learning, and high-
performance computing (HPC). An NGC account grants you access to these tools and gives you the
ability to set up a private registry to manage your customized software.
If you are the organization administrator for your DGX system purchase, work with NVIDIA Enterprise
Support to set up an NGC enterprise account. Refer to the NGC Private Registry User Guide for more
information about getting an NGC enterprise account.
4.4. Turning DGX H100 On and Off
DGX H100 is a complex system, integrating a large number of cutting-edge components with specific
startup and shutdown sequences. Observe the following startup and shutdown instructions.
4.4.1. Startup Considerations
To keep your DGX H100 running smoothly, allow up to a minute of idle time after reaching the login
prompt. This ensures that all components can complete their initialization.
4.4.2. Shutdown Considerations
When shutting down DGX H100, always initiate the shutdown from the operating system, with a momentary
press of the power button, or by using Graceful Shutdown from the BMC. Wait until the system
enters a powered-off state before performing any maintenance.
Warning: Risk of Danger - Removing power cables or using Power Distribution Units (PDUs) to shut
off the system while the Operating System is running may cause damage to sensitive components
in the DGX H100 server.
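The graceful shutdown paths described above can be sketched as shell functions. This is a minimal sketch under stated assumptions, not a procedure taken from this guide: it assumes a Linux-based operating system with the standard shutdown command, and it assumes that IPMI-over-LAN is enabled on the BMC so that an ipmitool soft power-off request can stand in for the Graceful Shutdown option. The function names and the BMC_HOST and BMC_USER variables are hypothetical placeholders.

```shell
# Sketch only: graceful shutdown helpers (names are illustrative).
# BMC_HOST and BMC_USER are placeholders that you must set yourself;
# the -a option makes ipmitool prompt for the BMC password.

# Option 1: shut down from the operating system.
dgx_shutdown_os() {
    sudo shutdown -h now
}

# Option 2 (assumption: IPMI-over-LAN is enabled on the BMC):
# request an ACPI soft shutdown through the BMC.
dgx_shutdown_bmc() {
    ipmitool -I lanplus -H "$BMC_HOST" -U "$BMC_USER" -a chassis power soft
}

# Query the chassis power state through the BMC. Start maintenance
# only after this reports that chassis power is off.
dgx_power_status() {
    ipmitool -I lanplus -H "$BMC_HOST" -U "$BMC_USER" -a chassis power status
}
```

For example, after setting BMC_HOST and BMC_USER, running dgx_shutdown_bmc followed by repeated calls to dgx_power_status gives a way to confirm the powered-off state remotely before any power cable is removed.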
4.5. Verifying Functionality - Quick Health Check
NVIDIA provides customers a diagnostics and management tool called NVIDIA System Management, or
NVSM. The nvsm command can be used to determine the system’s health, identify component issues
and alerts, or run a stress test to make sure all components are in working order while under load. The
use of Docker is key to getting the most performance out of the system since NVIDIA has optimized
containers for all the major frameworks and workloads used on DGX systems.
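As an illustration of the nvsm uses named above (system health, alerts, and a stress test), the following sketch wraps the corresponding invocations in shell functions. The function names are hypothetical, and the subcommands shown are drawn from general NVSM usage rather than from this section, so check them against the NVSM version installed on your system.

```shell
# Sketch only: wrappers around common nvsm invocations
# (function names are illustrative, not part of NVSM).

# Summarize overall system health, then list any active alerts.
dgx_quick_health_check() {
    sudo nvsm show health &&
    sudo nvsm show alerts
}

# Run the NVSM stress test to exercise components under load.
# This loads the system heavily, so run it only when the system is idle.
dgx_stress_test() {
    sudo nvsm stress-test
}
```

Run dgx_quick_health_check first. The stress test is kept in a separate function so that it is never started by accident on a system that is in use.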
Perform the following steps to run a health check on the DGX H100 system and to verify the
Docker and NVIDIA driver installation.
1. Establish an SSH connection to the DGX H100 System.