Nvidia DGX-2 SYSTEM User Manual

Using DGX-2 System in KVM Mode
DGX-2 System User Guide
after its state has been marked as ‘bad’ by the system, the VM will fail to start and an
appropriate error message is returned. Restarting an existing VM after a GPU fails will
result in the same failure and error message.
The following example shows the result of launching a VM when GPUs 12 and 13 have
been marked as degraded or failed.
nvidia-vm create --gpu-count 8 --gpu-index 8
ERROR: GPU 12 is in unexpected state "missing", can't use it - BDF:e0:00.0 SXMID:13 UUID:GPU-b7187786-d894-2266-d11d-21124dc61dd3
ERROR: GPU 13 is in unexpected state "missing", can't use it - BDF:e2:00.0 SXMID:16 UUID:GPU-9a6a6a52-c6b6-79c3-086b-fcf2d5b1c87e
ERROR: 2 GPU's are unavailable, unable to start this VM "dgx2vm-labMon1559-8g8-15"
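When this output is captured in a log, it can help to pull out which GPUs were reported as unavailable. The following is a minimal sketch, not part of the NVIDIA tooling: the helper name `parse_unavailable_gpus` is hypothetical, and the pattern is inferred only from the sample output above, so the actual message format may differ between releases.

```python
import re

# Pattern inferred from the sample output in this guide (an assumption);
# the real tool's message format may differ.
GPU_ERROR = re.compile(
    r'ERROR: GPU (?P<index>\d+) is in unexpected state "(?P<state>[^"]+)".*?'
    r'BDF:(?P<bdf>\S+)\s+SXMID:(?P<sxmid>\d+)\s+UUID:(?P<uuid>\S+)',
    re.DOTALL,  # the BDF/SXMID/UUID fields may wrap onto the next line
)


def parse_unavailable_gpus(output: str) -> list:
    """Return one dict per GPU reported as unavailable (hypothetical helper)."""
    return [
        {
            "index": int(m["index"]),
            "state": m["state"],
            "bdf": m["bdf"],
            "sxmid": int(m["sxmid"]),
            "uuid": m["uuid"],
        }
        for m in GPU_ERROR.finditer(output)
    ]


# Sample text copied from the example output in this guide.
SAMPLE = """ERROR: GPU 12 is in unexpected state "missing", can't use it -
BDF:e0:00.0 SXMID:13 UUID:GPU-b7187786-d894-2266-d11d-21124dc61dd3
ERROR: GPU 13 is in unexpected state "missing", can't use it -
BDF:e2:00.0 SXMID:16 UUID:GPU-9a6a6a52-c6b6-79c3-086b-fcf2d5b1c87e
"""

if __name__ == "__main__":
    for gpu in parse_unavailable_gpus(SAMPLE):
        print(gpu["index"], gpu["state"], gpu["bdf"], gpu["uuid"])
```

The summary line ("2 GPU's are unavailable") is deliberately not matched, because it does not identify a specific GPU.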
Note: If you attempt to launch a VM with a failed GPU before the system has
identified its failed state, the VM will fail to launch but without an error
message. If this happens, keep trying to launch the VM until the message
appears.
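The retry advice in the note can be sketched as a bounded loop. This is an illustration under stated assumptions, not an NVIDIA-provided procedure: `launch` is a hypothetical caller-supplied function that runs the launch command and returns whether the VM started along with any captured output, and the attempt limit and delay are arbitrary placeholder values.

```python
import time
from typing import Callable, Tuple


def launch_until_resolved(
    launch: Callable[[], Tuple[bool, str]],
    max_attempts: int = 5,
    delay_s: float = 10.0,
) -> Tuple[str, str]:
    """Retry a launch until the VM starts or an error message is reported.

    `launch` is a hypothetical callable returning (started, output).
    A silent failure is modelled as started=False with no "ERROR" text.
    """
    for attempt in range(1, max_attempts + 1):
        started, output = launch()
        if started:
            return "started", output
        if "ERROR" in output:
            # The system has now identified the failed GPU.
            return "error", output
        if attempt < max_attempts:
            time.sleep(delay_s)  # give the system time to mark the GPU state
    return "gave_up", ""
```

Bounding the number of attempts keeps a script from looping indefinitely if the failure is never reported.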
Restarting a VM After the System or VM Crashes
Some GPU errors may cause the VM or the system to crash.
In either case, whether the system crashes or only the VM crashes, you can attempt to
restart the VM.
Your VM should restart successfully if none of the associated GPUs failed. However, if
one or more of the GPUs associated with your VM failed, then the response depends on
whether the system has had a chance to identify the GPU as unavailable.
Failed GPU identified as unavailable
The system will return an error indicating that the GPU is missing or unavailable and
that the VM is unable to start.
Failed GPU not yet identified as unavailable
The VM crashes upon being restarted.
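The three restart cases described above can be summarized as a small decision function. This is only a restatement of the text for illustration; the function name and the returned strings are hypothetical and do not correspond to any output of the system.

```python
def expected_restart_outcome(gpu_failed: bool, identified_unavailable: bool) -> str:
    """Map the GPU condition to the restart behavior described in this guide."""
    if not gpu_failed:
        # None of the GPUs associated with the VM failed.
        return "VM should restart successfully"
    if identified_unavailable:
        # The system has already marked the failed GPU as unavailable.
        return "error returned: GPU missing or unavailable, VM unable to start"
    # The GPU failed but the system has not yet identified it.
    return "VM crashes upon being restarted"
```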
Restoring a System from Degraded Mode
All failed GPUs need to be replaced to restore the DGX-2 System from degraded mode.
