Set up Ollama on macOS

This guide walks you through the main steps of setting up an Ollama server on macOS for integration with GPT for Work.

This guide assumes that you either:

  • Run the Ollama server and use GPT for Work on the same local machine.

  • Run the Ollama server on one machine and use GPT for Work on another machine on the same local network.

Prerequisites
  • Local machine with enough processing power and memory to run LLMs (see the Ollama documentation for recommendations)

  • macOS 11 Big Sur or newer

  • Mac user account with administrator (sudo) privileges

  • Installed software: Homebrew (needed only for the optional HTTPS setup, which uses Homebrew to install nginx)

To set up the Ollama server on macOS:

  1. Install the server.

  2. Install a model on the server.

  3. Enable CORS for the server.

  4. (Optional) Configure access to the server.

  5. (Optional) Enable HTTPS for the server.

Install the Ollama server

  1. Download and run the macOS installer. Follow the on-screen instructions to complete the installation.

    The installer starts the Ollama server in the background and sets the server to start automatically when you log in. The installer also installs the Ollama desktop application for easily starting and stopping the server.

    tip

    You can manually start a new instance of the Ollama server in a terminal by running ollama serve. However, to avoid port conflicts, make sure no other Ollama server instances are running at the same time.
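
    For example, a quick way to check whether an Ollama server is already listening before you start one manually (the default port 11434 is assumed here):

      # List any process already listening on the default Ollama port
      lsof -i :11434

      # If nothing is listening, start a server instance in the foreground
      ollama serve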

  2. Open Terminal and verify that the server is running:

    curl http://localhost:11434

    Response if the server is running:

    Ollama is running

You have installed the Ollama server.

Install a model

  1. Install a model from the Ollama library:

    ollama pull <model-name>

    For example, to install the Mistral model:

    ollama pull mistral

    After the installation completes, the model is available for prompting on the Ollama server.

  2. Verify that the model works:

    curl http://localhost:11434/api/generate -d '{ "model": "mistral", "prompt": "Who are you?", "stream": false }'

    The above request asks the server to generate a response to the specified prompt with the specified model; "stream": false tells the server to return the entire response in a single reply instead of streaming it piece by piece. If the model works, the server returns a JSON object containing the response and metadata about the generation.

You have installed the model on the Ollama server. For more information about working with models, see the Ollama documentation.
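
If you later want to see which models are installed on the server, or remove one, the Ollama CLI includes commands for both. For example (mistral is used here only because it was installed above):

  # List the models installed on this server
  ollama list

  # Remove a model you no longer need
  ollama rm mistral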

Enable CORS for the Ollama server

By default, the Ollama server only accepts same-origin requests. Since GPT for Work always has a different origin from the Ollama server, you must enable cross-origin resource sharing (CORS) for the server using the OLLAMA_ORIGINS environment variable.

To enable CORS for the Ollama server:

  1. Set OLLAMA_ORIGINS with the origins that are allowed to access the server:

    # Set a single origin
    launchctl setenv OLLAMA_ORIGINS "<ORIGIN>"

    # Set multiple origins
    launchctl setenv OLLAMA_ORIGINS "<ORIGIN_1>,<ORIGIN_2>,..."

    For example:

    • Allow any origin to make requests to the server:

      launchctl setenv OLLAMA_ORIGINS "*"
    • Allow only GPT for Work to make requests to the server:

      launchctl setenv OLLAMA_ORIGINS "https://excel-addin.gptforwork.com"
  2. Restart the server for the variable to take effect.
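
    One way to restart the server, assuming you installed it with the desktop application, is to quit and reopen the Ollama app:

      # Quit the Ollama desktop application (this also stops the background server)
      osascript -e 'quit app "Ollama"'

      # Reopen the application; the server starts again and picks up the new variable
      open -a Ollama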

This completes the minimum required setup. You can now use http://localhost:11434 as the Ollama endpoint in GPT for Work.
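
To spot-check that CORS is enabled, you can send a request with an Origin header and confirm that the response includes an Access-Control-Allow-Origin header. The origin below is the GPT for Work Excel add-in used as an example earlier:

  # Send a request with an Origin header and print the response headers
  curl -i -H "Origin: https://excel-addin.gptforwork.com" http://localhost:11434

  # If the origin is allowed, the response headers should include a line similar to:
  # Access-Control-Allow-Origin: https://excel-addin.gptforwork.com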

Configure access to the Ollama server

By default, the Ollama server binds to IP address 127.0.0.1 and port 11434, which corresponds to localhost:11434 on most machines. This means that you can only access the server from the machine where the server is running.

If you want to access the server from other machines on the local network, or if you want to change the port, you must rebind the server using the OLLAMA_HOST environment variable.

To rebind the Ollama server:

  1. Set OLLAMA_HOST with an IP address and optionally port:

    # Set IP address
    launchctl setenv OLLAMA_HOST "<IP_ADDRESS>"

    # Set IP address and port
    launchctl setenv OLLAMA_HOST "<IP_ADDRESS>:<PORT_NUMBER>"

    For example:

    • Allow all machines on the local network to access the server on the default port:

      launchctl setenv OLLAMA_HOST "0.0.0.0"
    • Allow all machines on the local network to access the server on port 11535:

      launchctl setenv OLLAMA_HOST "0.0.0.0:11535"
    • Allow only localhost access on port 11535:

      launchctl setenv OLLAMA_HOST "127.0.0.1:11535"
  2. Restart the server for the variable to take effect.
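
    After restarting, you can check that the server is reachable from another machine on the local network. The sketch below assumes the default port 11434 and uses <SERVER_IP> as a placeholder for the Mac's local IP address:

      # On the Mac running Ollama: look up its local IP address (en0 is typically the built-in Wi-Fi or Ethernet interface)
      ipconfig getifaddr en0

      # On another machine on the same network: verify that the server responds
      curl http://<SERVER_IP>:11434

      # Expected response: Ollama is running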

Enable HTTPS for the Ollama server

The Ollama server uses HTTP to serve models, while GPT for Work runs on HTTPS. Any request from GPT for Work to the server is therefore a mixed-content request (an HTTP request made from an HTTPS page). Modern web browsers do not allow mixed content. The only exceptions are mixed-content requests to http://127.0.0.1 and http://localhost, which most browsers treat as safe and therefore allow. Safari, however, blocks mixed-content requests even to the local machine.

To avoid mixed content, you must make the Ollama server accessible over HTTPS in the following cases:

  • The server runs on the same machine as GPT for Work, and you use GPT for Work from:

    • Excel Online or Word Online on Safari

    • Excel for Mac (uses Safari)

    • Word for Mac (uses Safari)

  • The server runs on a different machine than GPT for Work.

You make the Ollama server accessible over HTTPS by setting up a reverse proxy that hides the server behind an HTTPS interface. This guide uses nginx to set up the reverse proxy.

tip

You can also use Cloudflare Tunnel, ngrok, or a similar cloud-based tunneling service to set up HTTPS with minimal configuration. Note that with a cloud-based tunnel, your traffic is routed through an external provider over the internet.
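
For example, with ngrok installed and connected to an ngrok account, a single command can expose the local server over HTTPS; the HTTPS URL that ngrok prints is what you would then use as the endpoint (the --host-header option rewrites the Host header so requests appear to come from localhost):

  # Expose the local Ollama server through an HTTPS tunnel
  ngrok http 11434 --host-header="localhost:11434"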

To set up the reverse proxy:

  1. Set up nginx:

    1. Install nginx:

      brew install nginx
    2. Verify that the nginx web server is running by opening http://localhost:8080 (Homebrew configures nginx to listen on port 8080 by default). You should see the nginx welcome page. If the page does not load, start nginx first by running brew services start nginx.

  2. Set up a self-signed SSL certificate for the Ollama server:

    1. Change to the nginx configuration directory (the path depends on whether your Mac has an Apple silicon or an Intel processor):

      # Mac with an Apple silicon processor
      cd /opt/homebrew/etc/nginx

      # Mac with an Intel processor
      cd /usr/local/etc/nginx
    2. Generate the certificate:

      openssl req \
      -x509 \
      -newkey rsa:2048 \
      -nodes \
      -sha256 \
      -days 365 \
      -keyout ollama.key \
      -out ollama.crt \
      -subj '/CN=localhost' \
      -extensions extensions \
      -config <(printf "[dn]\nCN=localhost\n[req]\ndistinguished_name=dn\n[extensions]\nsubjectAltName=DNS:localhost\nkeyUsage=digitalSignature\nextendedKeyUsage=serverAuth")

      The command generates two files in the current directory:

      • ollama.crt: Public self-signed certificate

      • ollama.key: Private key used to sign the certificate

    3. Add the certificate to trusted root certificates:

      sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain ./ollama.crt
  3. Create a site configuration for the Ollama server:

    1. Open the nginx.conf file in your preferred text editor. The file is in the nginx configuration directory:

      # Mac with an Apple silicon processor
      /opt/homebrew/etc/nginx/nginx.conf

      # Mac with an Intel processor
      /usr/local/etc/nginx/nginx.conf
    2. Replace the content of the file with the following configuration:

      events {}

      http {
          server {
              listen 11435 ssl;
              server_name localhost;

              http2 on;

              ssl_certificate ollama.crt;
              ssl_certificate_key ollama.key;

              location / {
                  proxy_pass http://localhost:11434;

                  # Keep connection to backend alive.
                  proxy_http_version 1.1;
                  proxy_set_header Connection '';
              }
          }
      }

      The configuration uses port 11435 for HTTPS. Requests to https://localhost:11435 are forwarded to the Ollama server running at http://localhost:11434.

      note

      If you bound the Ollama server to a custom IP address or port, adjust http://localhost:11434 accordingly in the configuration. If you bound the server to 0.0.0.0, leave the IP address as is.

    3. Save and close the file.

    4. Test the nginx configuration:

      sudo nginx -t

      Response if the configuration is valid:

      nginx: the configuration file /opt/homebrew/etc/nginx/nginx.conf syntax is ok
      nginx: configuration file /opt/homebrew/etc/nginx/nginx.conf test is successful
    5. Start nginx:

      sudo nginx
  4. Verify that the HTTPS connection works:

    curl -k https://localhost:11435

    Response if the HTTPS connection works:

    Ollama is running
    tip

    You can also open https://localhost:11435 in a web browser to verify that the HTTPS connection works.

    [Screenshot: Ollama running over HTTPS in the browser]

You're done with the setup. You can now use the Ollama server HTTPS URL as the Ollama endpoint in GPT for Work.