Set up Ollama on macOS
This guide walks you through the main steps of setting up an Ollama server on macOS for integration with GPT for Work.
This guide assumes that you either:
Run the Ollama server and use GPT for Work on the same local machine.
Run the Ollama server on one machine and use GPT for Work on another machine on the same local network.
Prerequisites
Local machine with enough processing power and memory to run LLMs (see the Ollama documentation for recommendations)
macOS 11 Big Sur or newer
Mac user account with administrator (sudo) privileges
Installed software: Homebrew (used in this guide to install nginx)
To set up the Ollama server on macOS:
Install the Ollama server.
Install a model.
Enable CORS for the Ollama server.
(Optional) Configure access to the server.
(Optional) Enable HTTPS for the server.
Install the Ollama server
Download and run the macOS installer. Follow the on-screen instructions to complete the installation.
The installer starts the Ollama server in the background and sets the server to start automatically on system boot. The installer also installs the Ollama desktop application for easily starting and stopping the server.
Tip: You can manually start a new instance of the Ollama server in a terminal by running ollama serve. However, to avoid port conflicts, make sure no other Ollama server instances are running at the same time.
Open Terminal and verify that the server is running:
curl http://localhost:11434
Response if the server is running:
Ollama is running
You have installed the Ollama server.
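If you script the check above, for example right after restarting the server, it can help to retry a few times before giving up. The following is a minimal sketch, not part of Ollama: wait_until is a hypothetical helper name, and the one-second pause between attempts is an arbitrary choice.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# $1 = maximum number of attempts; remaining arguments = command to run.
wait_until() {
  attempts=$1
  shift
  while [ "$attempts" -gt 0 ]; do
    # Discard the command's output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    if [ "$attempts" -gt 0 ]; then
      sleep 1
    fi
  done
  return 1
}

# Example usage (assumes the default endpoint from this guide):
#   wait_until 10 curl -sf http://localhost:11434 && echo "server is up"
```

The -f flag makes curl return a non-zero exit status on HTTP errors, so the loop only stops once the server answers successfully.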
Install a model
Install a model from the Ollama library:
ollama pull <model-name>
For example, to install the Mistral model:
ollama pull mistral
After the installation completes, the model is available for prompting on the Ollama server.
Verify that the model works:
curl http://localhost:11434/api/generate -d '{ "model": "mistral", "prompt": "Who are you?", "stream": false }'
The above request asks the server to generate a response for the specified prompt with the specified model and to return the response in a single reply. If the model works, the server returns a JSON object containing the response and metadata about the response.
You have installed the model on the Ollama server. For more information about working with models, see the Ollama documentation.
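To repeat the verification request with different models or prompts, you can generate the request body instead of typing it by hand. This is a rough sketch under stated assumptions: generate_payload is a hypothetical helper name, and the prompt must not contain double quotes or backslashes, because the function does no JSON escaping.

```shell
# Hypothetical helper: build the JSON body for a non-streaming generate request.
# $1 = model name, $2 = prompt text (assumed free of quotes and backslashes).
generate_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Example usage (same request as in the verification step above):
#   curl http://localhost:11434/api/generate -d "$(generate_payload mistral 'Who are you?')"
```

For prompts with arbitrary characters, a JSON-aware tool is a safer choice than string formatting.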
Enable CORS for the Ollama server
By default, the Ollama server only accepts same-origin requests. Since GPT for Work always has a different origin from the Ollama server, you must enable cross-origin resource sharing (CORS) for the server using the OLLAMA_ORIGINS environment variable.
To enable CORS for the Ollama server:
Set OLLAMA_ORIGINS with the origins that are allowed to access the server:
# Set a single origin
launchctl setenv OLLAMA_ORIGINS "<ORIGIN>"
# Set multiple origins
launchctl setenv OLLAMA_ORIGINS "<ORIGIN_1>, <ORIGIN_2>, ..."
For example:
Allow any origin to make requests to the server:
launchctl setenv OLLAMA_ORIGINS "*"
Allow only GPT for Work to make requests to the server:
launchctl setenv OLLAMA_ORIGINS "https://excel-addin.gptforwork.com"
Restart the server for the variable to take effect.
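One way to confirm that the CORS setting took effect is to send a request with an Origin header and inspect the response headers. The sketch below assumes that the server echoes an allowed origin (or *) in an Access-Control-Allow-Origin header, which is common CORS behavior but is not confirmed by this guide. cors_allows is a hypothetical helper name.

```shell
# Hypothetical helper: check raw response headers for a CORS allowance.
# $1 = response headers as text, $2 = origin that should be allowed.
cors_allows() {
  # Strip carriage returns, then pull out the header value (case-insensitive name).
  cors_allowed=$(printf '%s\n' "$1" | tr -d '\r' |
    awk -F': ' 'tolower($1) == "access-control-allow-origin" { print $2 }')
  [ "$cors_allowed" = "*" ] || [ "$cors_allowed" = "$2" ]
}

# Example usage (assumes the default endpoint from this guide):
#   headers=$(curl -s -i -H "Origin: https://excel-addin.gptforwork.com" http://localhost:11434)
#   cors_allows "$headers" "https://excel-addin.gptforwork.com" && echo "origin allowed"
#
# To see the value currently stored by launchctl:
#   launchctl getenv OLLAMA_ORIGINS
```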
You've completed the minimum setup required by the Ollama server. You can now use http://localhost:11434 as the Ollama endpoint in GPT for Work.
If you want to access the Ollama server from other machines on the local network, you still need to configure access to the server and enable HTTPS for the server.
If you use GPT for Work on Safari, you still need to enable HTTPS for the server.
Configure access to the Ollama server
By default, the Ollama server binds to IP address 127.0.0.1 and port 11434, which resolve to localhost:11434 on most machines. This means that you can only access the server from the same machine where the server is running.
If you want to access the server from other machines on the local network, or if you want to change the port, you must rebind the server using the OLLAMA_HOST environment variable.
To rebind the Ollama server:
Set OLLAMA_HOST with an IP address and optionally port:
# Set IP address
launchctl setenv OLLAMA_HOST "<IP_ADDRESS>"
# Set IP address and port
launchctl setenv OLLAMA_HOST "<IP_ADDRESS>:<PORT_NUMBER>"
For example:
Allow all machines on the local network to access the server on the default port:
launchctl setenv OLLAMA_HOST "0.0.0.0"
Allow all machines on the local network to access the server on port 11535:
launchctl setenv OLLAMA_HOST "0.0.0.0:11535"
Allow only localhost access on port 11535:
launchctl setenv OLLAMA_HOST "127.0.0.1:11535"
Restart the server for the variable to take effect.
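After rebinding, the verification URL depends on the port you chose. The sketch below derives the port from an OLLAMA_HOST value in the IPv4 formats shown above, falling back to the default 11434 when no port is given. ollama_port is a hypothetical helper name, and the function does not handle IPv6 addresses or values with a URL scheme.

```shell
# Hypothetical helper: extract the port from an OLLAMA_HOST value.
# $1 = value such as "0.0.0.0" or "0.0.0.0:11535" (IPv4 forms only).
ollama_port() {
  case "$1" in
    *:*) printf '%s\n' "${1##*:}" ;;   # keep everything after the last colon
    *)   printf '11434\n' ;;           # no port given: default port
  esac
}

# Example usage, run on the machine that hosts the server:
#   port=$(ollama_port "$(launchctl getenv OLLAMA_HOST)")
#   curl "http://localhost:$port"
```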
Enable HTTPS for the Ollama server
The Ollama server uses HTTP to serve models, while GPT for Work runs on HTTPS. By default, therefore, any request from GPT for Work to the server is a mixed-content request (HTTP vs. HTTPS). Modern web browsers do not allow mixed content. The only exceptions are mixed-content requests to http://127.0.0.1 and http://localhost, which most browsers treat as safe and therefore allow. Safari, however, blocks mixed-content requests even on the current machine.
To avoid mixed content, you must make the Ollama server accessible over HTTPS in the following cases:
The server runs on the same machine as GPT for Work, and you use GPT for Work from:
Excel Online or Word Online on Safari
Excel for Mac (uses Safari)
Word for Mac (uses Safari)
The server runs on a different machine than GPT for Work.
You make the Ollama server accessible over HTTPS by setting up a reverse proxy that hides the server behind an HTTPS interface. This guide uses nginx to set up the reverse proxy.
You can also use Cloudflare Tunnel, ngrok, or a similar cloud-based tunneling service to set up HTTPS with minimal configuration. Note that with a cloud-based service your traffic will be routed through an external service over the internet.
To set up the reverse proxy:
Set up nginx:
Install nginx:
brew install nginx
Verify that the nginx web server is running by opening http://localhost. You should see the nginx welcome page.
Set up a self-signed SSL certificate for the Ollama server:
Change to the nginx configuration directory (varies based on your Mac version):
# Mac with an Apple silicon processor
cd /opt/homebrew/etc/nginx
# Mac with an Intel processor
cd /usr/local/etc/nginx
Generate the certificate:
openssl req \
-x509 \
-newkey rsa:2048 \
-nodes \
-sha256 \
-days 365 \
-keyout ollama.key \
-out ollama.crt \
-subj '/CN=localhost' \
-extensions extensions \
-config <(printf "[dn]\nCN=localhost\n[req]\ndistinguished_name=dn\n[extensions]\nsubjectAltName=DNS:localhost\nkeyUsage=digitalSignature\nextendedKeyUsage=serverAuth")
The command generates two files in the current directory:
ollama.crt: Public self-signed certificate
ollama.key: Private key used to sign the certificate
Add the certificate to trusted root certificates:
sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain ./ollama.crt
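Because the certificate above is valid for 365 days, it can be useful to check later whether it is still current. The following sketch uses the -checkend option of openssl x509, which exits with a non-zero status if the certificate expires within the given number of seconds. cert_is_current is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a PEM certificate has not yet expired.
# $1 = path to the certificate file.
cert_is_current() {
  # -checkend 0 asks: does the certificate expire within the next 0 seconds?
  openssl x509 -in "$1" -noout -checkend 0 >/dev/null
}

# Example usage, run from the nginx configuration directory:
#   cert_is_current ollama.crt && openssl x509 -in ollama.crt -noout -subject -enddate
```

If the check fails, generating a new certificate with the openssl req command above and trusting it again is one way to recover.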
Create a site configuration for the Ollama server:
Open the nginx.conf file in your preferred text editor. The file is in the nginx configuration directory:
# Mac with an Apple silicon processor
/opt/homebrew/etc/nginx/nginx.conf
# Mac with an Intel processor
/usr/local/etc/nginx/nginx.conf
Replace the content of the file with the following configuration:
events {}
http {
    server {
        listen 11435 ssl;
        server_name localhost;
        http2 on;
        ssl_certificate ollama.crt;
        ssl_certificate_key ollama.key;
        location / {
            proxy_pass http://localhost:11434;
            # Keep connection to backend alive.
            proxy_http_version 1.1;
            proxy_set_header Connection '';
        }
    }
}
The configuration uses port 11435 for HTTPS. Requests to https://localhost:11435 are forwarded to the Ollama server running at http://localhost:11434.
Note: If you bound the Ollama server to a custom IP address or port, adjust http://localhost:11434 accordingly in the configuration. If you bound the server to 0.0.0.0, leave the IP address as is.
Save and close the file.
Test the nginx configuration:
sudo nginx -t
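Response if the configuration is valid (the path shown is for a Mac with an Apple silicon processor):
nginx: the configuration file /opt/homebrew/etc/nginx/nginx.conf syntax is ok
nginx: configuration file /opt/homebrew/etc/nginx/nginx.conf test is successful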
Start nginx:
sudo nginx
Verify that the HTTPS connection works:
curl -k https://localhost:11435
Response if the HTTPS connection works:
Ollama is running
Tip: You can also open https://localhost:11435 in a browser to verify that the HTTPS connection works.
You're done with the setup. You can now use the Ollama server HTTPS URL as the Ollama endpoint in GPT for Work.