Monday, March 27, 2017

Build an NVIDIA CUDA server with Ubuntu 16.04 in 4 steps



*First of all, if you are starting from a brand-new server, I suggest installing Ubuntu 16.04 WITHOUT the NVIDIA graphics cards plugged in. This prevents Ubuntu from automatically loading Nouveau, the open-source NVIDIA driver. Nouveau can cause issues such as a black screen or lightdm crashes, to name a few. It is quite likely that you will see NOTHING at the initial boot if you install Ubuntu 16.04 with the NVIDIA cards already in place.

Once you can log in to the Ubuntu server, install CUDA by following the 4 steps below:

1. Disable Nouveau
If you are running the Desktop version, switch to a text console by pressing
Ctrl+Alt+F1

Open or create "blacklist-nouveau.conf":

sudo vim /etc/modprobe.d/blacklist-nouveau.conf

Add the following lines to the file:

blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
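
If you would rather not edit the file by hand, the same five lines can be written from a script. The sketch below is only an illustration: the function name write_nouveau_blacklist is made up for this example, and it writes to a scratch file so that it can be tried without root.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): write the blacklist
# entries listed above to the file given as the first argument.
write_nouveau_blacklist() {
    cat > "$1" <<'EOF'
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
EOF
}

# Write to a scratch file first so nothing under /etc is touched yet.
scratch_file=$(mktemp)
write_nouveau_blacklist "$scratch_file"
cat "$scratch_file"
```

Once the output looks right, copy the scratch file to /etc/modprobe.d/blacklist-nouveau.conf with sudo cp.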

Then regenerate the initramfs so that the blacklist takes effect at the next boot:

sudo update-initramfs -u
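
The blacklist only applies from the next boot, so it can be worth confirming afterwards that the module is really gone. This is a minimal sketch, assuming the module shows up under the name nouveau in /proc/modules; the function name nouveau_loaded is invented for this example.

```shell
#!/bin/sh
# Succeeds if a module list contains nouveau. The list defaults to
# /proc/modules, which is the same data that lsmod prints.
nouveau_loaded() {
    grep -q '^nouveau ' "${1:-/proc/modules}" 2>/dev/null
}

if nouveau_loaded; then
    echo "nouveau is still loaded: re-check the blacklist file and reboot"
else
    echo "nouveau is not loaded"
fi
```

Typing lsmod | grep nouveau by hand gives the same answer: no output means the module is not loaded.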


2. Install NVIDIA driver
You can try installing the NVIDIA driver directly:

sudo apt-get install nvidia-375
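
To find out in advance whether apt knows about the package, you can look at the "Candidate:" line printed by apt-cache policy. The snippet below is a rough sketch of that check; candidate_version is a name chosen for this example, and the parsing assumes the usual apt-cache policy layout.

```shell
#!/bin/sh
# Print the "Candidate:" version from `apt-cache policy` output on stdin,
# or "(none)" when no candidate line is present.
candidate_version() {
    awk '$1 == "Candidate:" { print $2; found = 1 }
         END { if (!found) print "(none)" }'
}

if command -v apt-cache >/dev/null 2>&1; then
    version=$(apt-cache policy nvidia-375 2>/dev/null | candidate_version)
    if [ "$version" = "(none)" ]; then
        echo "nvidia-375 is not available from the current sources"
    else
        echo "nvidia-375 candidate: $version"
    fi
fi
```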

*If apt-get cannot find the NVIDIA driver package, add the PPA manually. You could also install the driver bundled with the CUDA toolkit, but I suggest installing from the Ubuntu PPA:
sudo apt-add-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo service lightdm stop
sudo apt-get purge 'nvidia-*'
sudo apt-get install nvidia-375


Once the driver is installed, reboot your system, then test the driver by typing:

nvidia-smi

You should see the NVIDIA cards installed in your system:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:02:00.0      On |                  N/A |
|  0%   43C    P8     8W / 200W |    294MiB /  8107MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 0000:82:00.0     Off |                  N/A |
|  0%   38C    P8     8W / 200W |      1MiB /  8114MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                            
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1430    G   /usr/lib/xorg/Xorg                             144MiB |
|    0      2549    G   /usr/bin/compiz                                148MiB |
+-----------------------------------------------------------------------------+
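
On a headless server it is often handier to get the same information in a form a script can read. The sketch below uses the --query-gpu and --format=csv,noheader options of nvidia-smi; the helper name summarize_gpus is made up for this example, and it assumes the fields are separated by a comma and a space.

```shell
#!/bin/sh
# Summarize CSV lines of the form "index, name, driver_version" read
# from stdin: count the GPUs and report the driver version.
summarize_gpus() {
    awk -F', ' '
        { count++; driver = $3 }
        END {
            if (count > 0) printf "%d GPU(s) found, driver %s\n", count, driver
            else print "no GPUs reported"
        }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,name,driver_version --format=csv,noheader | summarize_gpus
else
    echo "nvidia-smi not found; the driver is probably not installed"
fi
```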



3. Install CUDA Toolkit
Download the CUDA toolkit from the official NVIDIA site:
https://developer.nvidia.com/cuda-downloads

Remember to select "runfile (local)" as the installer type, then run the downloaded file:

sudo ./cuda_8.0.61_375.26_linux.run --override

As we have already installed the NVIDIA driver, answer "no" when the installer offers to install a driver:
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 361.62?
(y)es/(n)o/(q)uit: n


Ubuntu 16.04 ships gcc 5 by default, which the CUDA installer may flag as unsupported. The --override flag forces the installer to ignore the unsupported gcc version.

Once the installation is done, you may notice a warning:
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 361.00 is required for CUDA 8.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run -silent -driver


Don't worry, we can ignore this message because the driver was already installed in step 2.
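
If you want to double-check that the installed driver meets the minimum named in the warning, a version comparison is enough. This is only a sketch: version_ge is a helper written for this example, it relies on the -V option of GNU sort, and the installed version is hard-coded from the nvidia-smi output shown earlier.

```shell
#!/bin/sh
# Succeeds if version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort), which Ubuntu's coreutils provides.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# 361.00 is the minimum quoted in the installer warning.
required=361.00
# 375.39 is the version from the nvidia-smi output earlier in this post;
# replace it with your own driver version.
installed=375.39

if version_ge "$installed" "$required"; then
    echo "driver $installed satisfies the minimum $required"
else
    echo "driver $installed is older than $required"
fi
```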


4. Downgrade gcc to 4.9
Finally, we need to downgrade gcc/g++ in Ubuntu to a version below 5.0. Some people suggest removing the version check from the CUDA header file instead. Don't do this; it will cause compiler errors. Downgrade gcc with "update-alternatives":


sudo apt-get install g++-4.9 gcc-4.9 libgcc-4.9

sudo update-alternatives --remove-all gcc 
sudo update-alternatives --remove-all g++

sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20

sudo update-alternatives --query gcc
sudo update-alternatives --query g++
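
To confirm that the switch took effect, check which version the plain gcc command now reports. The snippet below is a small sketch of that check; gcc_below_5 is a name invented for this example, and it only looks at the major version number.

```shell
#!/bin/sh
# Succeeds if the version string $1 (e.g. "4.9.3") has a major version below 5.
gcc_below_5() {
    major=${1%%.*}
    [ "$major" -lt 5 ]
}

if command -v gcc >/dev/null 2>&1; then
    version=$(gcc -dumpversion)
    if gcc_below_5 "$version"; then
        echo "gcc $version is selected"
    else
        echo "gcc $version is still the default; re-check update-alternatives"
    fi
fi
```

The same check applies to g++ by running g++ -dumpversion.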