Why Run Models Locally?

Local AI gives you complete data privacy, zero API costs after hardware investment, and no rate limits. Essential for sensitive data workloads.

Install Ollama

Download from ollama.com for Mac, Windows, or Linux. It installs as a background service.

```bash

ollama run llama3.3

ollama run phi4

ollama list

```

Ollama exposes an API at http://localhost:11434 that is compatible with the OpenAI SDK:

```javascript

import OpenAI from "openai";

const client = new OpenAI({

baseURL: "http://localhost:11434/v1",

apiKey: "ollama", // required but ignored

});

const response = await client.chat.completions.create({

model: "llama3.3",

messages: [{ role: "user", content: "Hello!" }],

});

```

- 7B models: 8GB RAM (runs on most laptops)

- 13B models: 16GB RAM

- 70B models: 48GB RAM or VRAM (M3 Max or RTX 4090)