Set up Ollama on macOS
This guide walks you through the main steps of setting up an Ollama server for use in GPT for Work on macOS.
This guide assumes that you use GPT for Work on the same machine that hosts Ollama.
Prerequisites
Local machine with enough processing power and memory to run LLMs (see the Ollama documentation for recommendations)
macOS 11 Big Sur or newer
Mac user account with administrator (sudo) privileges
Installed software: Homebrew (needed only for the optional HTTPS setup, which uses it to install nginx)
To set up the Ollama server on macOS:
Install the Ollama server.
Install a model.
Enable CORS for the server.
(Optional) Enable HTTPS for the server.
Install the Ollama server
Download the macOS installer from the Ollama website and run it. Follow the on-screen instructions to complete the installation.
The installer starts the Ollama server in the background and sets the server to start automatically on system boot. The installer also installs the Ollama desktop application for easily starting and stopping the server.
Tip: You can manually start a new instance of the Ollama server in a terminal by running ollama serve. However, to avoid port conflicts, make sure no other Ollama server instances are running at the same time.
Open Terminal and verify that the server is running:
curl http://localhost:11434
Response if the server is running:
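Ollama is running
(The exact wording may vary between Ollama versions.)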
You have installed the Ollama server.
Install a model
Install a model from the Ollama library:
ollama pull <model-name>
For example, to install the Llama 3.2 3B model:
ollama pull llama3.2
After the installation completes, the model is available for prompting on the Ollama server.
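To check which models are installed on the server at any time, you can list them with the Ollama CLI:
ollama list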
Verify that the model works:
curl http://localhost:11434/api/generate -d '{ "model": "llama3.2", "prompt": "Who are you?", "stream": false }'
The above request asks the server to generate a response for the specified prompt with the specified model and, because "stream" is set to false, to return the response in a single reply rather than streaming it. If the model works, the server returns a JSON object containing the response and metadata about it.
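An abridged example of the kind of JSON object the server returns (field values are illustrative, and the full response includes additional timing and token-count fields):
{
  "model": "llama3.2",
  "created_at": "2025-01-01T12:00:00Z",
  "response": "I'm Llama, an AI assistant built on a large language model.",
  "done": true
}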
You have installed the model on the Ollama server. For more information about working with models, see the Ollama documentation.
Enable CORS for the Ollama server
By default, the Ollama server only accepts same-origin requests. Since GPT for Work always has a different origin from the Ollama server, you must enable cross-origin resource sharing (CORS) for the server using the OLLAMA_ORIGINS environment variable.
To enable CORS for the Ollama server:
Set OLLAMA_ORIGINS to the origins that are allowed to access the server:
# Set a single origin
launchctl setenv OLLAMA_ORIGINS "<ORIGIN>"
# Set multiple origins
launchctl setenv OLLAMA_ORIGINS "<ORIGIN_1>, <ORIGIN_2>, ..."
For example:
Allow any origin to make requests to the server:
launchctl setenv OLLAMA_ORIGINS "*"
Allow only GPT for Work to make requests to the server:
launchctl setenv OLLAMA_ORIGINS "https://excel-addin.gptforwork.com"
Restart the server for the variable to take effect.
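One way to restart the server is to quit the Ollama menu bar application and open it again. As a sketch, assuming the Ollama desktop application is installed in the standard location, you can also do this from a terminal:
# Quit the Ollama desktop app (this stops the background server)
osascript -e 'quit app "Ollama"'
# Reopen the app; it restarts the server with the new environment variable
open -a Ollama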
Verify that the /api/tags endpoint of the server works:
curl http://localhost:11434/api/tags
GPT for Work uses the endpoint to fetch a list of models installed on the server. If the endpoint works, the server returns a JSON object with a models property listing all currently installed models:
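An abridged example of the response with the Llama 3.2 3B model installed (values are illustrative, and each entry includes additional metadata such as the digest and model details):
{
  "models": [
    {
      "name": "llama3.2:latest",
      "modified_at": "2025-01-01T12:00:00Z",
      "size": 2019393189
    }
  ]
}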
You have enabled CORS for the Ollama server.
You have completed the minimum setup required to access the Ollama server from GPT for Work on the same machine. You can now set http://localhost:11434 as the Ollama server URL in GPT for Work, provided the add-in is not running in Safari or in Microsoft Excel or Word.
If you use GPT for Work in Safari, or in Microsoft Excel or Word, enable HTTPS for the Ollama server.
Enable HTTPS for the Ollama server
The Ollama server uses HTTP to serve models, while GPT for Work runs on HTTPS. By default, therefore, any request from GPT for Work to the server is a mixed-content request (an HTTPS page calling an HTTP resource). As a rule, modern web browsers block mixed content. The only exceptions are mixed-content requests to http://127.0.0.1 and http://localhost, which most browsers treat as safe and therefore allow. Safari, however, blocks mixed-content requests even to the local machine.
To avoid mixed content, you must make the Ollama server accessible over HTTPS if you use GPT for Work from:
Excel Online or Word Online on Safari
Excel for Mac (uses Safari)
Word for Mac (uses Safari)
You make the Ollama server accessible over HTTPS by setting up a reverse proxy that hides the server behind an HTTPS interface. This guide uses nginx to set up the reverse proxy.
You can also use Cloudflare Tunnel, ngrok, or a similar cloud-based tunneling service to set up HTTPS with minimal configuration. Note that with a cloud-based service, your traffic is routed over the internet through a third party.
To set up the reverse proxy:
Install nginx:
brew install nginx
Create a self-signed SSL certificate for the Ollama server:
Change to the nginx configuration directory (the path depends on whether your Mac has an Apple silicon or an Intel processor):
# Mac with an Apple silicon processor
cd /opt/homebrew/etc/nginx
# Mac with an Intel processor
cd /usr/local/etc/nginx
Generate the certificate:
openssl req \
-x509 \
-newkey rsa:2048 \
-nodes \
-sha256 \
-days 365 \
-keyout ollama.key \
-out ollama.crt \
-subj '/CN=localhost' \
-extensions extensions \
-config <(printf "[dn]\nCN=localhost\n[req]\ndistinguished_name=dn\n[extensions]\nsubjectAltName=DNS:localhost\nkeyUsage=digitalSignature\nextendedKeyUsage=serverAuth")
The command generates two files in the current directory:
ollama.crt: Public self-signed certificate
ollama.key: Private key used to sign the certificate
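Optionally, you can inspect the generated certificate to confirm its subject, validity period, and subject alternative name:
openssl x509 -in ollama.crt -noout -text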
Add the certificate to trusted root certificates:
sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain ./ollama.crt
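Optionally, you can check that macOS now trusts the certificate; the command below should report that the verification succeeded:
security verify-cert -c ./ollama.crt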
Create a site configuration for the Ollama server:
Open the nginx.conf file in your preferred text editor. The file is in the nginx configuration directory:
# Mac with an Apple silicon processor
/opt/homebrew/etc/nginx/nginx.conf
# Mac with an Intel processor
/usr/local/etc/nginx/nginx.conf
Replace the content of the file with the following configuration:
events {}
http {
server {
listen 11435 ssl;
server_name localhost;
http2 on;
ssl_certificate ollama.crt;
ssl_certificate_key ollama.key;
location / {
proxy_pass http://localhost:11434;
# Keep connection to backend alive.
proxy_http_version 1.1;
proxy_set_header Connection '';
}
}
}
The configuration uses port 11435 for HTTPS. Requests to https://localhost:11435 are forwarded to the Ollama server running at http://localhost:11434.
Save and close the file.
Test the nginx configuration:
sudo nginx -t
Response if the configuration is valid:
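On an Apple silicon Mac, the output looks similar to the following (on an Intel Mac, the path starts with /usr/local instead):
nginx: the configuration file /opt/homebrew/etc/nginx/nginx.conf syntax is ok
nginx: configuration file /opt/homebrew/etc/nginx/nginx.conf test is successful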
Start nginx:
sudo nginx
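If you later edit nginx.conf while nginx is running, you can apply the changes without fully restarting the proxy:
sudo nginx -s reload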
Verify that the /api/tags endpoint of the Ollama server works over the HTTPS connection:
curl -k https://localhost:11435/api/tags
GPT for Work uses the endpoint to fetch a list of models installed on the server. If the endpoint works, the server returns a JSON object with a models property listing all currently installed models, as in the earlier CORS verification step.
You have enabled HTTPS for the Ollama server.
You have completed the setup required to access the Ollama server from GPT for Work on the same machine. You can now set https://localhost:11435 as the Ollama server URL in GPT for Work, including when the add-in is running in Safari or in Microsoft Excel or Word.