Tuesday, August 23, 2016

nvidia - How do I properly install CUDA 8 on an Azure VM running Ubuntu 14.04 LTS?



I've tried to install CUDA on three different VMs but have been unsuccessful in getting it to recognize my GPU.




I am using an Azure VM (Standard NV6) with an M60 GPU.



With a fresh VM I run the following commands taken from this guide:



wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda-repo-ubuntu1404-8-0-local-ga2_8.0.61-1_amd64-deb

sudo dpkg -i cuda-repo-ubuntu1604-8-0-local_8.0.44-1_amd64-deb
sudo apt-get update
sudo apt-get install -y cuda



It appears to run successfully and doesn't report any problems. But when I run



nvidia-smi


I receive the following:



NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running



I have also tried with 16.04 LTS and various other GPU instances. Google tells me others are using these Azure GPU instances with TensorFlow, so it doesn't appear to be an issue with the graphics card.



Finally, I have reviewed what seems to be the canonical guide to installing CUDA on Ubuntu but it fails when running



sudo ./NVIDIA-Linux-x86_64-331.62.run 






The message in the log file:



ERROR: Unable to load the 'nvidia-drm' kernel module.


My Question



What is the most reliable method for installing CUDA 8 on Ubuntu 14.04 LTS?



Are there any special precautions that I need to take when running CUDA on a VM?




Edit: Additional Info



uname -a returns



Linux 2017-02-21-josh-gpu 4.4.0-64-generic #85~14.04.1-Ubuntu SMP Mon Feb 20 12:10:54 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux


lsmod returns




Module                  Size  Used by
drm_kms_helper 151552 0
drm 360448 1 drm_kms_helper
syscopyarea 16384 1 drm_kms_helper
sysfillrect 16384 1 drm_kms_helper
sysimgblt 16384 1 drm_kms_helper
fb_sys_fops 16384 1 drm_kms_helper
udf 90112 0
crc_itu_t 16384 1 udf
dm_crypt 28672 0
joydev 20480 0
hid_generic 16384 0
hid_hyperv 16384 0
hid 118784 2 hid_hyperv,hid_generic
hyperv_keyboard 16384 0
hv_balloon 24576 0
input_leds 16384 0
serio_raw 16384 0
hv_netvsc 40960 0
hv_storvsc 20480 2
hv_utils 28672 2
scsi_transport_fc 65536 1 hv_storvsc
crct10dif_pclmul 16384 0
crc32_pclmul 16384 0
ghash_clmulni_intel 16384 0
hyperv_fb 20480 1
aesni_intel 167936 0
aes_x86_64 20480 1 aesni_intel
lrw 16384 1 aesni_intel
gf128mul 16384 1 lrw
glue_helper 16384 1 aesni_intel
ablk_helper 16384 1 aesni_intel
cryptd 20480 3 ghash_clmulni_intel,aesni_intel,ablk_helper
psmouse 126976 0
hv_vmbus 90112 7 hv_balloon,hyperv_keyboard,hv_netvsc,hid_hyperv,hv_utils,hyperv_fb,hv_storvsc
floppy 73728 0


The official Azure documentation points out:





Currently, Linux GPU support is only available on Azure NC VMs running Ubuntu Server 16.04 LTS.




I'm not sure why they even let you create GPU instances with 14.04 installed, but hopefully this will help spread the word.
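Since the release turned out to be the deciding factor, it can help to check it before installing anything. The following is only a sketch: is_ubuntu_release is a hypothetical helper name, and it assumes the image ships a standard os-release file with ID and VERSION_ID fields.

```shell
# Hypothetical helper: succeeds only when the os-release file reports the
# wanted Ubuntu release. The file path is a parameter so the check can be
# tried against a copy instead of the live /etc/os-release.
is_ubuntu_release() {
    wanted="$1"
    file="${2:-/etc/os-release}"
    # Run in a subshell so sourcing the file does not leak its variables.
    (
        . "$file" || exit 1
        [ "$ID" = "ubuntu" ] && [ "$VERSION_ID" = "$wanted" ]
    )
}

# Example: stop early on anything other than Ubuntu 16.04.
# is_ubuntu_release 16.04 || { echo "unsupported release" >&2; exit 1; }
```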



After creating a fresh 16.04 instance I did the following:



First, I had to uninstall/blacklist the Nouveau drivers that come pre-installed on Ubuntu 16.04. They're not compatible with the NVIDIA drivers we're trying to install and will cause errors later on if we don't remove them.




 sudo nano /etc/modprobe.d/blacklist.conf


At the bottom of the file add the following entries:



blacklist amd76x_edac #this might not be required for x86 32 bit users.
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
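If you would rather script this than edit the file by hand, something like the sketch below appends each entry only once, so running it twice does not duplicate lines. blacklist_module is a hypothetical helper, and the drop-in file name in the example is an assumption, not something the steps above require.

```shell
# Hypothetical helper: add "blacklist <module>" to a modprobe config file
# unless that exact line is already present.
blacklist_module() {
    module="$1"
    conf="$2"
    touch "$conf"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "blacklist $module" "$conf" || echo "blacklist $module" >> "$conf"
}

# Example (needs root; the drop-in file name is an assumption):
# for m in vga16fb nouveau rivafb nvidiafb rivatv; do
#     blacklist_module "$m" /etc/modprobe.d/blacklist-nouveau.conf
# done
```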


Reboot the VM with sudo reboot
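After the reboot it may be worth confirming that nouveau is really gone before running the NVIDIA installer. This is a rough sketch, not part of the original steps: module_loaded is a hypothetical helper that reads lsmod-style text on standard input and looks at the first column only.

```shell
# Hypothetical helper: read `lsmod`-style text on stdin and succeed if the
# named module appears in the first column.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Example, after the reboot:
# if lsmod | module_loaded nouveau; then echo "nouveau is still loaded" >&2; fi
```

Matching on the first column avoids false positives from the "Used by" column, where a module name can appear as a dependency of another module.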



I downloaded the drivers directly from Microsoft, but you can substitute with your preferred source:



wget -O NVIDIA-Linux-x86_64-384.73-grid.run 'https://go.microsoft.com/fwlink/?linkid=849941'


chmod +x NVIDIA-Linux-x86_64-384.73-grid.run

sudo ./NVIDIA-Linux-x86_64-384.73-grid.run


I just clicked through the default selected options in the runfile.



Verify driver installation by running nvidia-smi
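In a script, the same check can be made to fail loudly instead of relying on reading the output. The sketch below is an assumption about how one might wrap it: gpu_driver_ok is a hypothetical helper, and the command name is a parameter only so the function can be tried on a machine without a GPU. The --query-gpu and --format options are standard nvidia-smi options.

```shell
# Hypothetical helper: print "name, driver_version" for each GPU, or fail if
# the tool is missing or cannot reach the driver.
gpu_driver_ok() {
    smi="${1:-nvidia-smi}"
    if ! command -v "$smi" >/dev/null 2>&1; then
        echo "not found: $smi" >&2
        return 1
    fi
    # The exit status of the query becomes the function's exit status.
    "$smi" --query-gpu=name,driver_version --format=csv,noheader
}

# Example:
# gpu_driver_ok || echo "driver is not loaded; check the installer log" >&2
```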



Install CUDA Toolkit 8




CUDA_REPO_PKG=cuda-repo-ubuntu1604_8.0.44-1_amd64.deb

wget -O /tmp/${CUDA_REPO_PKG} http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/${CUDA_REPO_PKG}

sudo dpkg -i /tmp/${CUDA_REPO_PKG}

rm -f /tmp/${CUDA_REPO_PKG}

sudo apt-get update


sudo apt-get install cuda-drivers
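To confirm which toolkit release ended up on the machine, one option is to parse the output of nvcc --version. This is a sketch under assumptions: cuda_release is a hypothetical helper, it expects the usual "release X.Y," line in that output, and nvcc has to be on PATH, which the cuda-drivers package on its own may not provide.

```shell
# Hypothetical helper: read `nvcc --version` text on stdin and print the
# release number (for example 8.0). Prints nothing if no release line exists.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Example (assumes nvcc is on PATH, e.g. under /usr/local/cuda/bin):
# [ "$(nvcc --version | cuda_release)" = "8.0" ] || echo "CUDA 8 toolkit not found" >&2
```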
