Update README.md

Will Russell
2023-04-23 11:30:15 -04:00
committed by GitHub
parent 8a597ef772
commit 14931adc4c

@@ -8,31 +8,34 @@ A simple bash script for automatic install of the proper nvidia drivers on Debia
4. Reboot your machine when prompted to re-initialize your platform with the latest driver available for your host.
# This script will perform the following actions on your machine:
# The NVIDIA_drivers script will perform the following actions on your machine:
1. Check for and perform base updates (sudo apt update && sudo apt upgrade -y)
2. Add the PPA for "ubuntu-drivers" which handles NVIDIA driver installation repositories
3. Run a request to check the hardware of your NVIDIA card
4. Install the recommended driver for that card based on latest data in the PPA
4. Prompt for your selection of driver (or pressing return will select the recommended one)
5. Blacklist the default NOUVEAU driver (it will often cause nice new GPUs to boot into a graphics error)
6. Recompile the initramfs to include the new drivers and omit the old
7. Prompt you to restart.
7. Test with `nvidia-smi` for driver access, and then prompt you to restart.
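
Steps 5 and 6 above can be sketched roughly as follows. This is a minimal illustration, not the contents of the NVIDIA_drivers script: the function name `nouveau_blacklist_conf` and the target path `/etc/modprobe.d/blacklist-nouveau.conf` are assumptions chosen for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the repository): prints a modprobe
# configuration that keeps the nouveau module from loading.
nouveau_blacklist_conf() {
    printf '%s\n' \
        'blacklist nouveau' \
        'options nouveau modeset=0'
}

# Example usage (needs root; left commented out so the sketch has no side effects).
# The file path is an assumption; update-initramfs is the Debian/Ubuntu tool
# for rebuilding the initramfs.
# nouveau_blacklist_conf | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
# sudo update-initramfs -u
```

Keeping the file generation in a function that only prints makes it easy to inspect the result before anything is written under `/etc`.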
# Troubleshooting
There are a few scripts included for troubleshooting:
- A big undo script: `purge_nvidia.sh` ##Currently only works on ubuntu/deb/apt environments; calls dpkg/apt purge.
This script will completely remove and unload the modules/drivers and associated libraries that are necessary for nvidia to operate. Generally, you will want to run this ONLY if you are unable to figure out what else is stuck on your computer preventing the proper execution of nvidia drivers; it will generally un-stick your issue if the cause is a leftover package that wasn't properly removed before reinstallation.
**Warning:** Running this script will leave your machine without a valid driver selection and may result in an unstable/non-graphical environment. DO NOT FORGET TO REINSTALL A DRIVER (unless you want to debug further in multi-user.target instead of graphical.target, which is fine). It's best to run this from multi-user.target or a secondary TTY session, so that removal of the GPU driver doesn't hang because the desktop/GUI is still using it.
# Other Scripts and what they do:
- A MOK management script: `MOK_Enroll.sh`
Generally you shouldn't need to run this, but if you've enabled Secure Boot on your machine and your MOK key wasn't enrolled, you'll get an error initializing your new drivers. The MOK script just kicks off a prompt for you to register your Secure Boot key and restarts your machine: it'll ask you for a new Secure Boot password and will create a signature file that will be added to the MOK manager after a restart.
- remove_tool.sh: Removes any and all `nvidia` associated packages with a for-loop running through `dpkg -l`. Very helpful for driver version rollbacks. You should basically be able to just run this and remove nvidia packages, then change to an earlier driver version if you find that a newer driver breaks your setup.
Run that MOK script only if, after running the NVIDIA_drivers script, your machine keeps coming up with graphics driver issues or very low resolution.
A good tip-off that this is the case is if, during NVIDIA driver installation, you are asked to assign a Secure Boot passphrase. This means that the files aren't being signed automatically by your Secure Boot setup key, and they'll fail to initialize on the next restart.
- `purge_nvidia.sh` A much more involved big undo script:
[[Currently only works on ubuntu/deb/apt environments; calls dpkg/apt purge.]]
This script will completely remove and unload the modules/drivers and associated libraries that are necessary for nvidia to operate. Generally, you will want to run this ONLY if you are unable to figure out what else is stuck on your computer preventing the proper execution of nvidia drivers; it will generally un-stick your issue if the cause is a leftover package that wasn't properly removed before reinstallation. This is more heavy-handed than `remove_tool.sh`, but it ought to cure what ails the system if you are really in a bind. DO NOT FORGET TO REINSTALL A DRIVER afterwards (re-run `NVIDIA_drivers.sh`, unless you want to debug further in multi-user.target instead of graphical.target, which is fine). It's best to run this from multi-user.target or a secondary TTY session, so that removal of the GPU driver doesn't hang because the desktop/GUI is still using it.
- A script for Driver mismatch to address the `NVML: Driver/library version mismatch` error: `driver_mismatch.sh`
- `docker_gpu_setup.sh`: This script installs the `nvidia-container-runtime-toolkit` and `nvidia-container-runtime-hook` packages so that docker can communicate with your GPUs. Run this script to enable containerized GPU passthrough for training runs.
- `install_cuda.md`: These are the steps for installing the CUDA library on your machine. They are not scripted because I assume you'll have particular requirements for which build you need, but they should be self-explanatory.
- `driver_mismatch.sh` A script for Driver mismatch to address the `NVML: Driver/library version mismatch` error:
Run this script if you find that the above error messages are presented after driver installation. (Note that a reboot should also sufficiently resolve this problem, so start there first).
- `MOK_Enroll.sh` A MOK management script:
Generally you shouldn't need to run this, but if you've enabled Secure Boot on your machine and your MOK key wasn't enrolled, you'll get an error initializing your new drivers. The MOK script just kicks off a prompt for you to register your Secure Boot key and restarts your machine: it'll ask you for a new Secure Boot password and will create a signature file that will be added to the MOK manager after a restart. Run that MOK script only if, after running the NVIDIA_drivers script, your machine keeps coming up with graphics driver issues or very low resolution.
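
The `dpkg -l` loop described for `remove_tool.sh` might look something like the sketch below. It is an illustration under stated assumptions, not the script's actual contents: the helper name `list_nvidia_packages` is hypothetical, and filtering on the `ii` (installed) status is a choice made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from remove_tool.sh): reads `dpkg -l` output
# on stdin and prints the names of installed packages (status "ii") whose
# name contains "nvidia".
list_nvidia_packages() {
    awk '$1 == "ii" && $2 ~ /nvidia/ { print $2 }'
}

# Example usage (left commented out because it removes packages):
# for pkg in $(dpkg -l | list_nvidia_packages); do
#     sudo apt-get purge -y "$pkg"
# done
```

Running `dpkg -l | list_nvidia_packages` on its own first shows exactly which packages the loop would touch, which is useful before a driver rollback.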
# GENERAL Troubleshooting NVIDIA graphics problems: