January 7, 2021

[Note] CUDA reinstall with Ubuntu 20.04 and Nvidia driver 455.45.01

After I upgraded the Nvidia driver to 455, tensorflow-gpu stopped working. This is a note on reinstalling all the necessary software.

Environment

  • Xeon CPU with AVX, no BMI2
  • Ubuntu 20.04
  • Nvidia Driver 455
  • Quadro P2000 (compute capability 6.1)

 

What didn't work

I thought this might be an opportunity to upgrade CUDA, TensorFlow and everything else, but learned it's not that simple for my PC.

Tried CUDA 11.1 + TensorFlow 2.4: this requires a CPU that supports BMI2 instructions.  Otherwise I would need to compile TF 2.4 myself.
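
Before trying a newer TensorFlow build, it may help to check which instruction sets the CPU advertises. The following is a minimal sketch, assuming a Linux machine where /proc/cpuinfo is readable; the function names cpu_flags and missing_flags are made up for this note, and the list of required flags is only the one mentioned above, not an official TensorFlow requirement list.

```python
import os


def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, required=("avx", "bmi2")):
    """List the required flags (an assumption for this note) the CPU lacks."""
    flags = cpu_flags(cpuinfo_text)
    return [flag for flag in required if flag not in flags]


if __name__ == "__main__":
    # Only attempt the real check where /proc/cpuinfo exists (Linux).
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo") as f:
            print("Missing CPU flags:", missing_flags(f.read()) or "none")
```

On a CPU like the one described here, this would be expected to report bmi2 as missing.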

Tried downgrading the driver to 450 or 440 with CUDA 10.2 (or 10.1) + TF 2.2.0: after a reboot, the driver was back to 455.  Not working.
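
To confirm which driver version is actually loaded after a reboot, nvidia-smi can be queried from a script. This is a rough sketch, assuming nvidia-smi is on the PATH; the helper names parse_gpu_rows and query_driver are hypothetical, and the script returns an empty list when the tool is missing or fails.

```python
import shutil
import subprocess


def parse_gpu_rows(csv_text):
    """Turn 'driver, name' CSV rows from nvidia-smi into (driver, name) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        driver, _, name = line.partition(",")
        rows.append((driver.strip(), name.strip()))
    return rows


def query_driver():
    """Return [(driver_version, gpu_name), ...], or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version,name", "--format=csv,noheader"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # text output; works on Python 3.7
    )
    return parse_gpu_rows(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    for driver, name in query_driver():
        print(f"{name}: driver {driver}")
```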


What worked 

Driver v455 + CUDA 10.2 + TF 2.2.0

And the installation was pretty simple:

1. Install the v455 driver

2. Install CUDA: $ sudo apt-get install nvidia-cuda-toolkit

3. Create a conda environment and install TensorFlow:

$ conda create -n tf python=3.7.9
$ conda activate tf
$ conda install tensorflow-gpu=1.15.0 keras

Or,

$ conda create -n tf2 python=3.7.9
$ conda activate tf2
$ conda install tensorflow-gpu=2.2.0 keras

Step #2 will install CUDA 10.2.
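
Once the environment is created, a quick way to check that TensorFlow can see the GPU is to ask it directly. This is a minimal sketch meant for the TF 2.x environment (tf.config.list_physical_devices is not available under that name in TF 1.x); the function name gpu_report is made up for this note, and it returns None when TensorFlow cannot be imported.

```python
def gpu_report():
    """Return TensorFlow's view of the GPU, or None if TensorFlow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return {
        "tf_version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # An empty list here means TensorFlow did not find a usable GPU.
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    report = gpu_report()
    if report is None:
        print("TensorFlow is not installed in this environment")
    else:
        print(report)
```

Run it inside the activated tf2 environment; if "gpus" comes back empty, the driver/CUDA/TensorFlow combination is probably still mismatched.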


Jupyter

Install Jupyter

$ conda install jupyter
$ conda install -c conda-forge jupyter_nbextensions_configurator jupyter_contrib_nbextensions

$ jupyter nbextensions_configurator enable --user


When I ran Jupyter, I noticed a lot of error messages like this:

Config option `template_path` not recognized by `ExporterCollapsibleHeadings`.  Did you mean one of: `extra_template_paths, template_name, template_paths`?

This can be fixed by downgrading nbconvert from 6.0.7 to 5.6.1:

$ conda install nbconvert=5.6.1
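
To see whether an environment is affected before downgrading, the installed nbconvert version can be checked from Python. This is only a sketch: the helper names are hypothetical, and the rule "6.x or later may show the warning" is an assumption based on the 6.0.7 versus 5.6.1 behaviour described above.

```python
def major_version(version):
    """Return the leading integer of a version string such as '6.0.7'."""
    return int(version.split(".")[0])


def needs_downgrade(version):
    """True if this nbconvert version is in the 6.x-or-later series (assumed affected)."""
    return major_version(version) >= 6


if __name__ == "__main__":
    try:
        import nbconvert
    except ImportError:
        print("nbconvert is not installed in this environment")
    else:
        v = nbconvert.__version__
        status = "may show the warning" if needs_downgrade(v) else "5.x or older"
        print(f"nbconvert {v}: {status}")
```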

But the Jupyter extensions were still broken.  The Nbextensions tab did not appear, and going to File->Edit->nbextensions gave this error:

404 GET /static/notebook/js/mathjaxutils.js?v=2021010720231

To fix this,

1. Go to the conda env's directory, e.g. $ cd ~/anaconda3/envs/tf

2. $ vi ./lib/python3.7/site-packages/jupyter_nbextensions_configurator/static/nbextensions_configurator/render/render.js

3. Change 'notebook/js/mathjaxutils' to 'base/js/mathjaxutils'
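
The manual edit above can also be scripted, which is handy when several environments need the same fix. The following is a minimal sketch, not part of the original fix: the function name patch_render_js is made up for this note, and it keeps a .bak copy of the original file before rewriting it.

```python
import shutil
from pathlib import Path

OLD = "notebook/js/mathjaxutils"
NEW = "base/js/mathjaxutils"


def patch_render_js(path):
    """Replace the old mathjaxutils module path in render.js; return True if changed."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    if OLD not in text:
        return False  # already patched, or a different version of the file
    shutil.copy2(str(path), str(path) + ".bak")  # keep the original next to it
    path.write_text(text.replace(OLD, NEW), encoding="utf-8")
    return True
```

Call patch_render_js with the full path to the render.js file from step 2; it returns False and leaves the file untouched if the old string is not present.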


Reference:

https://discourse.jupyter.org/t/the-static-notebook-js-mathjaxutils-js-is-missing/7303/2



Helpful Commands

$ lscpu | grep -i bmi2

$ sudo lshw -C display

$ sudo ubuntu-drivers devices



