Specification | Value |
---|---|
GPU | 8x NVIDIA H100 Tensor Core GPUs |
GPU Memory | 640 GB HBM3 (80 GB per GPU) |
System Memory | 2 TB DDR5 |
Form Factor | 8U rackmount |
Storage | 30 TB NVMe SSD |
Power Supply | 10 kW |
Interconnect | NVLink 4.0 |
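The aggregate GPU memory figure in the table follows directly from the GPU count and the per-GPU capacity. A minimal sketch of that arithmetic, using only the values listed above (the variable names are illustrative):

```python
# Cross-check the aggregate GPU memory figure from the specification table.
GPU_COUNT = 8            # number of H100 GPUs listed in the table
MEMORY_PER_GPU_GB = 80   # per-GPU HBM3 capacity listed in the table

total_gpu_memory_gb = GPU_COUNT * MEMORY_PER_GPU_GB
print(f"Total GPU memory: {total_gpu_memory_gb} GB")
```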
Details the components, specifications, and physical layout of the DGX H100 system.
Explains network ports, modules, and supported cables for DGX H100 connectivity.
Lists the software components included in the DGX OS stack.
Explains how to connect to the DGX H100 console directly or remotely.
Describes how to establish an SSH connection to the DGX H100 operating system.
Covers the initial system setup process after powering on or reimaging.
Outlines recommended tasks after the initial system setup, such as updates.
Provides requirements and instructions for initial installation and configuration.
Provides instructions for safely powering the DGX H100 system on and off.
Outlines how to use NVSM for system health checks and verification.
Details methods for providing GPU support for Docker containers.
Discusses security updates for CPU vulnerabilities and their performance impact.
Guides users on how to access the system BIOS setup utility.
Provides instructions for changing the system's boot order.
Provides steps to connect to the DGX H100's BMC via a web browser.
Details the primary controls available in the BMC web interface.
Explains how to change BMC login credentials and manage users.
Explains how to use the remote console (KVM) via the BMC.
Discusses user-level security practices for the DGX H100 system.
Covers security measures incorporated into the NVIDIA DGX H100 system.
Explains how to securely erase data from DGX H100 system SSDs.
Lists the Redfish features supported by the DGX H100 system.
Provides examples of using Redfish APIs for system management.
Gives general safety advice for installing and maintaining the DGX H100 server.
Explains safety symbols and general warnings for personal injury.
Details important electrical safety information for the DGX H100.
Advises on safety precautions when accessing the inside of the system.
Details FCC compliance for Class A digital devices in the US.
Covers European Conformity (CE) and relevant directives.
Details CU TR and FAC compliance for the region.
Provides the license agreement terms for the Micron msecli utility.
Outlines the terms and conditions for using Mellanox OFED software.
Contains general disclaimers, warranty information, and usage restrictions.
Lists NVIDIA and other company trademarks.
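As a rough illustration of the Redfish-based management mentioned in the summaries above, the sketch below builds (but does not send) an HTTPS GET request for a resource under the standard Redfish service root, `/redfish/v1/`. The host name `bmc.example.com`, the credentials, and the helper name `build_redfish_request` are hypothetical placeholders, not values from the manual. The sketch assumes HTTP Basic authentication is enabled on the BMC.

```python
import base64
import urllib.request


def build_redfish_request(bmc_host, username, password, resource=""):
    """Build (but do not send) a GET request for a Redfish resource.

    bmc_host, username, and password are placeholders; substitute the
    values for your own BMC.
    """
    # /redfish/v1/ is the service root defined by the Redfish standard.
    url = f"https://{bmc_host}/redfish/v1/{resource.lstrip('/')}"
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    req = urllib.request.Request(url, method="GET")
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Accept", "application/json")
    return req


# Hypothetical host and credentials, for illustration only.
req = build_redfish_request("bmc.example.com", "admin", "changeme", "Systems")
print(req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request. Whether the BMC certificate validates depends on how the BMC is configured, so that step is left out here.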