NVIDIA CUDA

CUDA (Compute Unified Device Architecture) is the soul of NVIDIA GPUs: it transforms a graphics card from a gaming device into a massively parallel supercomputer. Combined with cuDNN (CUDA Deep Neural Network Library), it enables GPU-accelerated model training in PyTorch or TensorFlow. This guide walks through correctly installing and configuring these core components on Windows. Once you understand the configuration, the same process applies to a Linux environment with minor adjustments.
Why does this matter? Environment configuration is the first hurdle in AI development. Version mismatches are one of the most time-consuming problems developers and researchers encounter. Getting the installation right once saves countless hours of debugging RuntimeError exceptions down the road.

1. Updating the GPU Driver

Before installing CUDA, having a healthy driver version is a prerequisite.
  1. Check system information. Right-click on an empty area of the desktop, open the NVIDIA Control Panel, and click “System Information” in the lower-left corner.
  2. Confirm the current version. In System Information, you can see the current driver version and the maximum supported CUDA version.
  3. Download the latest driver. Go to the NVIDIA Driver Downloads page.
    Studio Driver or Game Ready?
    • Game Ready Driver: Optimized for the latest games, updated frequently.
    • Studio Driver: Optimized for creative software (Blender, Premiere) and stability.
    Recommendation: For deep learning development, either works. If you don’t play cutting-edge AAA titles regularly, the Studio Driver is generally more stable.
  4. Install and restart. Run the installer after downloading. Screen flickering and brief blackouts during installation are normal. After installation, be sure to restart your computer to apply the changes.
  5. Verify the driver. After restarting, open System Information again and confirm that it shows the driver version you just installed.
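
The driver check in step 5 can also be scripted. The following is a minimal sketch, assuming nvidia-smi (installed with the driver) is on PATH and supports its --query-gpu option; the helper names parse_driver_query and query_driver are illustrative and not part of any NVIDIA tool.

```python
import shutil
import subprocess


def parse_driver_query(csv_text):
    """Parse 'name, driver_version' CSV lines into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        # Split from the right so a comma inside the GPU name cannot break parsing.
        name, driver = (part.strip() for part in line.rsplit(",", 1))
        gpus.append({"name": name, "driver_version": driver})
    return gpus


def query_driver():
    """Return GPU name and driver version via nvidia-smi, or [] if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    result = subprocess.run(
        [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    return parse_driver_query(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    gpus = query_driver()
    if not gpus:
        print("nvidia-smi not found or no GPU reported")
    for gpu in gpus:
        print(f"{gpu['name']}: driver {gpu['driver_version']}")
```

An empty result usually means the driver is not installed or nvidia-smi is not on PATH, so it is worth re-checking the installation before moving on.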

2. Installing the CUDA Toolkit

The CUDA Toolkit includes compilers, development tools, and runtime libraries. For the official installation procedure, refer to the CUDA Installation Guide for Microsoft Windows.
Overview
  • CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model developed by NVIDIA. It harnesses the computational power of NVIDIA GPUs to accelerate scientific computing, image processing, machine learning, and more. It achieves high-performance parallel processing by distributing computational tasks across thousands of GPU cores.
  • cuDNN (CUDA Deep Neural Network Library) is a high-performance library optimized by NVIDIA for deep learning, built on top of CUDA.
  1. Confirm version requirements. Newer is not always better! First check which CUDA versions your deep learning framework (PyTorch/TensorFlow) supports, and choose a toolkit version from that list.

    This guide uses CUDA 12.8.1 as the example.
  2. Choose the installer type. On the download page, you will see two installer types:
    Type | Network (exe) | Local (exe)
    File size | Small (~13.7 MB) | Large (~3.3 GB)
    Installation method | Downloads components during installation (requires internet) | Offline installation (includes all components)
    Recommendation | Fine if your connection is fast | Recommended (avoids interrupted installations)
  3. Run the installer. Execute the downloaded .exe file. For most users, selecting “Express” installation is sufficient. If you are an advanced user who needs multiple CUDA versions to coexist, choose “Custom” and deselect the Driver component (you already installed the latest driver in section 1).
  4. Verify environment variables. The installer normally adds environment variables automatically. Check whether CUDA_PATH is present:
    1. Search for “Edit the system environment variables”
    2. Click “Environment Variables”
    3. Look for CUDA_PATH under system variables
    If it is missing, check whether the folder exists and add the following variables manually:
    Variable name | Path
    CUDA_PATH | C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8
    CUDA_PATH_V12_8 | C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8
    Open a terminal (PowerShell / CMD) and run the verification command:
    nvcc -V
    
    If you see output similar to Cuda compilation tools, release 12.x, V12.x.x, the installation was successful.
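
The two checks in step 4 (the CUDA_PATH variable and the nvcc -V output) can be combined in a short script. This is a sketch, not an official tool: it assumes the nvcc output contains the text "release X.Y" as shown above, and the function names parse_nvcc_release and check_toolkit are made up for illustration.

```python
import os
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract (major, minor) from 'nvcc -V' output, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def check_toolkit():
    """Report CUDA_PATH and the nvcc release, using None for anything missing."""
    report = {"CUDA_PATH": os.environ.get("CUDA_PATH"), "nvcc_release": None}
    nvcc = shutil.which("nvcc")
    if nvcc is not None:
        result = subprocess.run([nvcc, "-V"], capture_output=True, text=True, check=False)
        report["nvcc_release"] = parse_nvcc_release(result.stdout)
    return report


if __name__ == "__main__":
    print(check_toolkit())
```

A None value for CUDA_PATH points to the missing environment variable described above, while a None value for nvcc_release suggests that the toolkit's bin folder is not on Path.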

3. Configuring cuDNN

cuDNN is the critical library that accelerates neural network computation. It is not an executable — it is a set of binary files that must be placed manually.
  1. Download cuDNN. Go to the cuDNN Archive (requires an NVIDIA Developer account login). Download the zip archive that matches your CUDA version (a 12.x build for CUDA 12.x). The cuDNN version number (e.g., cuDNN v8.x.x) is NVIDIA’s version of the library itself and is separate from the CUDA version number.
  2. Extract and copy files. Extract the downloaded zip file. You will see folders named bin, include, and lib. Copy their contents into the matching folders of the CUDA installation directory.
    Default CUDA path: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.x
    Perform the following copy operations in order (the system may prompt for administrator privileges):
    • Extracted folder/
      • bin/
        • cudnn*.dll ➡️ Copy to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.x\bin
      • include/
        • cudnn*.h ➡️ Copy to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.x\include
      • lib/
        • x64/
          • cudnn*.lib ➡️ Copy to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.x\lib\x64
    Where exactly should files go? Make sure you merge the files into the existing folders; do not replace or delete the original folders themselves!
  3. Check cuDNN environment variables. Ensure that C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.x\bin has been added to the system Path environment variable. The lib\x64 folder does not need to be on Path, because the compiler configuration handles it.
    Path | Purpose | Add to PATH?
    ...\CUDA\v12.x\bin | Load .dll files at runtime | Yes
    ...\CUDA\v12.x\lib\x64 | Link .lib files at compile time | No (handled by compiler configuration)
    • If you need to specify the lib\x64 path explicitly, configure it in your compiler settings (e.g., the library path in Visual Studio).
    Deployment note: This procedure covers rapid deployment of CUDA and cuDNN. For more advanced configurations, such as running multiple CUDA versions side by side or using Conda environments, you can adjust the paths and management approach accordingly. For the most up-to-date information, refer to the NVIDIA official documentation.
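
The copy operations in step 2 can be sketched as a small script that merges files into the existing folders without deleting anything. The function name merge_cudnn and the dry_run flag are illustrative assumptions; the folder names and file patterns come from the copy list above. Pass the real locations of your extracted cuDNN folder and CUDA installation as arguments.

```python
import shutil
from pathlib import Path

# (subfolder, file pattern) pairs taken from the copy list in step 2
COPY_RULES = [("bin", "cudnn*.dll"), ("include", "cudnn*.h"), ("lib/x64", "cudnn*.lib")]


def merge_cudnn(cudnn_dir, cuda_dir, dry_run=True):
    """Copy cuDNN files into the matching CUDA subfolders; other files are untouched."""
    copied = []
    for subfolder, pattern in COPY_RULES:
        source = Path(cudnn_dir) / subfolder
        target = Path(cuda_dir) / subfolder
        if not source.is_dir():
            continue  # this archive layout has no such folder; skip it
        for file in sorted(source.glob(pattern)):
            destination = target / file.name
            if not dry_run:
                target.mkdir(parents=True, exist_ok=True)
                shutil.copy2(file, destination)
            copied.append(destination)
    return copied
```

With the default dry_run=True the function only returns the list of destination paths, which makes it easy to review the plan first. Writing under C:\Program Files requires an elevated (administrator) terminal when calling it with dry_run=False.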

4. Verification and Testing

The environment is set up, but does it actually work? Let’s verify with Python. First, run nvidia-smi in a terminal to see the maximum CUDA version supported by your driver (shown in the header, e.g., CUDA Version: 12.8).

Create a test environment

Using Conda to isolate the environment is recommended, to avoid conflicts with the system Python.
# Create a Python 3.11 environment
conda create -n test_gpu python=3.11 -y

# Activate the environment
conda activate test_gpu

# Install PyTorch (adjust the version according to the official instructions)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
CUDA version compatibility: NVIDIA drivers are backward compatible with older CUDA versions. See the FAQ entry “Does a newer driver break older CUDA versions?”.

Run a test script

Create a gpu_test.py file and execute it:
gpu_test.py
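import torch
import torch.nn as nn

print("=" * 40)
print(f"PyTorch Version: {torch.__version__}")
print(f"CUDA Available:  {torch.cuda.is_available()}")

if torch.cuda.is_available():
    device = torch.device("cuda")
    print(f"Device Name:     {torch.cuda.get_device_name(device)}")
    # CUDA version this PyTorch build was compiled with
    print(f"CUDA Version:    {torch.version.cuda}")
    # cuDNN as seen by PyTorch
    print(f"cuDNN Available: {torch.backends.cudnn.is_available()}")
    print(f"cuDNN Version:   {torch.backends.cudnn.version()}")

    # Simple convolution test; GPU convolutions normally run through cuDNN
    try:
        model = nn.Conv2d(1, 1, 3).to(device)
        data = torch.randn(1, 1, 64, 64, device=device)
        out = model(data)
        torch.cuda.synchronize()  # wait for the GPU so errors surface here
        print("✓ cuDNN Test Passed! Convolution executed successfully.")
    except Exception as e:
        print(f"✗ cuDNN Error: {e}")
else:
    print("✗ CUDA not detected.")
print("=" * 40)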
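If the output contains ✓ cuDNN Test Passed!, congratulations: your GPU computing environment is ready. Keep in mind that the official PyTorch pip wheels bundle their own CUDA runtime and cuDNN libraries, so this script mainly confirms that the driver and PyTorch work together; nvcc -V remains the check for the system-wide toolkit.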

Troubleshooting

Q: nvcc -V command not found?

A: Confirm that C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.x\bin has been added to the Path system environment variable.
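
To check this without clicking through the environment variable dialog, a small script can test whether the bin folder appears in Path. This is a sketch for Windows-style paths; path_contains is a hypothetical helper name, and it assumes CUDA_PATH is set as described in section 2.

```python
import os
from pathlib import PureWindowsPath


def path_contains(path_value, wanted_dir):
    """Return True if wanted_dir is one of the ';'-separated entries in path_value."""
    wanted = PureWindowsPath(wanted_dir)
    for entry in path_value.split(";"):
        entry = entry.strip().strip('"')
        # PureWindowsPath compares case-insensitively and ignores trailing separators.
        if entry and PureWindowsPath(entry) == wanted:
            return True
    return False


if __name__ == "__main__":
    cuda_path = os.environ.get("CUDA_PATH")
    if cuda_path is None:
        print("CUDA_PATH is not set")
    else:
        bin_dir = str(PureWindowsPath(cuda_path) / "bin")
        found = path_contains(os.environ.get("PATH", ""), bin_dir)
        print(f"{bin_dir} on Path: {found}")
```

After changing Path, open a new terminal before rerunning the check, because terminals that are already open keep the old value.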

Q: PyTorch raises AssertionError: Torch not compiled with CUDA enabled

A: You likely installed the CPU-only version of PyTorch. Run pip uninstall torch to remove it, then reinstall using --index-url to point to the CUDA-enabled wheel source.
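
To confirm which build is installed before reinstalling, the sketch below inspects torch.version.cuda, which is None for builds without CUDA support. The helper names are illustrative, and the "+cu128" / "+cpu" suffix on the version string is a convention of the official wheels rather than a guarantee, so treat the tag as a hint only.

```python
def build_flavor(version_string):
    """Return the local build tag of a version string, e.g. 'cu128' or 'cpu', else None."""
    _, plus, tag = version_string.partition("+")
    return tag if plus else None


def describe_install():
    """Summarize the installed PyTorch build, or report that it is missing."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    tag = build_flavor(torch.__version__) or "no build tag"
    if torch.version.cuda is None:
        return f"Build without CUDA ({torch.__version__}, {tag}); reinstall from a CUDA wheel index"
    return f"CUDA build ({torch.__version__}, {tag}, built with CUDA {torch.version.cuda})"


if __name__ == "__main__":
    print(describe_install())
```

Run it inside the same Conda environment you use for training, since each environment can hold a different PyTorch build.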

Q: Does a newer driver break older CUDA versions?

A: Generally, no. NVIDIA drivers are backward compatible: a newer driver (e.g., 550.xx) can run programs compiled with an older CUDA Toolkit (e.g., 11.8). The reverse, running a newer CUDA toolkit with an older driver, is generally not supported outside the special cases NVIDIA documents. For detailed compatibility rules (Minor Version Compatibility, Forward Compatibility, etc.), refer to the CUDA Version Compatibility documentation.
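
The basic rule can be sketched as a version comparison between the CUDA Version shown in the nvidia-smi header and the toolkit version you plan to use. This is a deliberately simplified check with made-up helper names: it ignores Minor Version Compatibility and Forward Compatibility, so a False result is a reason to read the compatibility documentation, not proof of failure. Versions are expected in at least major.minor form.

```python
import re


def parse_version(text):
    """Turn '12.8' or '12.8.1' into a comparable (major, minor) tuple."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


def driver_cuda_version(nvidia_smi_output):
    """Extract the 'CUDA Version: X.Y' value from nvidia-smi output, or None."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", nvidia_smi_output)
    return parse_version(match.group(1)) if match else None


def driver_covers(driver_max, toolkit):
    """True when the driver's maximum CUDA version is at least the toolkit version."""
    return parse_version(driver_max) >= parse_version(toolkit)


if __name__ == "__main__":
    # Newer driver, older toolkit: the backward-compatible case from the answer above
    print(driver_covers("12.4", "11.8"))
    # Older driver, newer toolkit: the case that needs a closer look
    print(driver_covers("12.3", "12.8.1"))
```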