Getting Started with the NVIDIA AI Stack

NVIDIA's AI ecosystem is vast. This guide helps you navigate the key components and get productive quickly.

1. CUDA Toolkit

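The CUDA Toolkit is the foundation of the NVIDIA AI stack: it provides the compiler (`nvcc`), the runtime, and the GPU-accelerated libraries that the rest of the stack builds on. Install it from [developer.nvidia.com](https://developer.nvidia.com/cuda-toolkit).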

```bash
# Check that the NVIDIA driver can see your GPU
nvidia-smi

# Install CUDA (Ubuntu 22.04, x86_64)
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit
```
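After installation it helps to confirm what the driver actually reports. The following is a minimal sketch, assuming `nvidia-smi` is on the `PATH` and supports its `--query-gpu` CSV output mode; the helper names `parse_gpu_csv` and `list_gpus` are hypothetical and exist only for this illustration.

```python
import shutil
import subprocess

# Ask nvidia-smi for one "name, memory in MiB" line per GPU, without header or units.
QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"]


def parse_gpu_csv(text):
    """Parse 'name, memory_mib' lines into a list of (name, memory_mib) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        if "," not in line:
            continue  # skip blank or unexpected lines
        name, memory = (field.strip() for field in line.rsplit(",", 1))
        try:
            gpus.append((name, int(memory)))
        except ValueError:
            continue  # memory field was not a number (for example "[N/A]")
    return gpus


def list_gpus():
    """Return the GPUs the driver reports, or [] if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    for name, memory_mib in list_gpus():
        print(f"{name}: {memory_mib} MiB")
```

An empty result usually means the driver is not installed or not loaded, which is worth fixing before moving on to containers.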

2. NGC (NVIDIA GPU Cloud)

NGC provides pre-built, GPU-optimized containers for AI frameworks.

```bash
# Pull the PyTorch container
docker pull nvcr.io/nvidia/pytorch:24.02-py3

# Run with GPU access (requires the NVIDIA Container Toolkit on the host)
docker run --gpus all -it nvcr.io/nvidia/pytorch:24.02-py3
```
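Once inside the container, a quick check confirms that PyTorch can reach the GPU. This is a minimal sketch using `torch.cuda.is_available()` and `torch.cuda.get_device_name()`; the function name `describe_gpu_support` is hypothetical, and the import guard is there only so the script also runs where PyTorch is absent.

```python
import importlib.util


def describe_gpu_support():
    """Return a one-line summary of whether PyTorch can use a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"

    import torch  # imported lazily so the check above can run without it

    if not torch.cuda.is_available():
        return "PyTorch is installed but cannot see a CUDA device"
    return f"CUDA OK: {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print(describe_gpu_support())
```

If the container reports no CUDA device, the usual suspects are a missing `--gpus all` flag or a host without the NVIDIA Container Toolkit.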

3. TensorRT-LLM for Inference

TensorRT-LLM compiles LLMs into optimized TensorRT engines for high-throughput inference:

```bash
pip install tensorrt-llm

# Build a TensorRT engine from a checkpoint already converted to
# TensorRT-LLM format (./my-model is a placeholder path)
trtllm-build --checkpoint_dir ./my-model --output_dir ./trt-engine
```

4. NeMo for Custom Training

NeMo is NVIDIA's framework for building, training, and fine-tuning custom models:

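```bash
# Quote the extras so shells such as zsh do not expand the brackets
pip install "nemo_toolkit[all]"

# Fine-tune a pre-trained LLM with LoRA
# (nemo_finetune.py and its flags stand in for your own fine-tuning script)
python nemo_finetune.py --model nemo_llm --data my_dataset --method lora
```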

5. NVIDIA NIM for Quick Deployment

NIM packages a model as a container that serves an OpenAI-compatible HTTP API, so a model can be deployed in minutes:

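```bash
# Start the container; NGC_API_KEY must already be set in your shell
docker run -d --gpus all -e NGC_API_KEY -p 8000:8000 nvcr.io/nim/meta/llama-3-8b-instruct

# Send a chat request; "model" must match an ID listed by GET /v1/models
curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" \
  -d '{"model":"meta/llama-3-8b-instruct","messages":[{"role":"user","content":"Hello"}]}'
```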
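The same request can be sent from application code. The sketch below uses only the Python standard library and assumes the server is listening on `localhost:8000` and returns an OpenAI-style response body (`choices[0].message.content`); the helper names `build_chat_request` and `chat` are hypothetical, and the model ID must be whatever the running server reports.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # port published by the docker run command


def build_chat_request(model, prompt, base_url=BASE_URL):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model, prompt, timeout=60):
    """Send the prompt and return the assistant's reply text.

    Assumes an OpenAI-style response: choices[0].message.content.
    """
    request = build_chat_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `chat("<model-id>", "Hello")` against a running container should return the reply as a plain string; separating request construction from sending keeps the payload easy to inspect without a live server.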

Which GPU Do I Need?

| Use Case | Recommended GPU | VRAM |
|----------|-----------------|------|
| Learning/Prototyping | RTX 4060 Ti | 16 GB |
| Fine-tuning 7B models | RTX 4090 | 24 GB |
| Fine-tuning 70B models | A100/H100 | 80 GB |
| Production inference | H100/B200 | 80-192 GB |
| Training foundation models | DGX cluster | Multi-node |
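
A rough way to sanity-check these pairings is to estimate the memory taken by the model weights alone: parameter count times bytes per parameter. The sketch below is a back-of-the-envelope calculation, not a sizing tool; it ignores activations, optimizer state, and KV cache, which add substantially to real usage, and the names `BYTES_PER_PARAM` and `weight_memory_gb` are hypothetical.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params, dtype="fp16"):
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def weights_fit(num_params, dtype, vram_gb):
    """True if the weights alone fit in the given VRAM (no headroom included)."""
    return weight_memory_gb(num_params, dtype) <= vram_gb


if __name__ == "__main__":
    for label, params in [("7B", 7e9), ("70B", 70e9)]:
        for dtype in ("fp16", "int4"):
            print(f"{label} {dtype}: {weight_memory_gb(params, dtype):.1f} GB")
```

Under these assumptions the fp16 weights of a 7B model take about 14 GB, which fits a 24 GB card with room left for LoRA adapters, while the fp16 weights of a 70B model take about 140 GB and exceed a single 80 GB GPU, so fine-tuning at that size typically relies on quantized weights or multiple GPUs.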
