Set up Ollama on macOS

This guide walks you through the main steps of setting up an Ollama server on macOS for integration with GPT for Work.

This guide assumes that you either:

  • Run the Ollama server and use GPT for Work on the same local machine.

  • Run the Ollama server on one machine and use GPT for Work on another machine on the same local network.

Prerequisites
  • Local machine with enough processing power and memory to run LLMs (see the Ollama documentation for recommendations)

  • macOS 11 Big Sur or newer

  • Mac user account with administrator (sudo) privileges

  • Installed software: Homebrew (needed only for the optional HTTPS setup, which uses Homebrew to install nginx)

To set up the Ollama server on macOS:

  1. Install the server.

  2. Install a model on the server.

  3. Enable CORS for the server.

  4. (Optional) Configure access to the server.

  5. (Optional) Enable HTTPS for the server.

Install the Ollama server

  1. Download and run the macOS installer. Follow the on-screen instructions to complete the installation.

    The installer starts the Ollama server in the background and sets the server to start automatically when you log in. The installer also installs the Ollama desktop application for easily starting and stopping the server.

    tip

    You can manually start a new instance of the Ollama server in a terminal by running ollama serve. However, to avoid port conflicts, make sure no other Ollama server instances are running at the same time.
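
    For example, a quick way to check whether an Ollama server is already listening before you start one manually (the default port 11434 is assumed here):

      # List any process already listening on the default Ollama port
      lsof -i :11434

      # If nothing is listening, start a server instance in the foreground
      ollama serve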

  2. Open Terminal and verify that the server is running:

    curl http://localhost:11434

    Response if the server is running:

    Ollama is running

You have installed the Ollama server.

Install a model

  1. Install a model from the Ollama library:

    ollama pull <model-name>

    For example, to install the Mistral model:

    ollama pull mistral

    After the installation completes, the model is available for prompting on the Ollama server.

  2. Verify that the model works:

    curl http://localhost:11434/api/generate -d '{ "model": "mistral", "prompt": "Who are you?", "stream": false }'

    The above request asks the server to generate a response to the specified prompt with the specified model; "stream": false tells the server to return the entire response in a single reply instead of streaming it piece by piece. If the model works, the server returns a JSON object containing the response and metadata about the generation.

You have installed the model on the Ollama server. For more information about working with models, see the Ollama documentation.
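
If you later want to see which models are installed on the server, or remove one, the Ollama CLI includes commands for both. For example (mistral is used here only because it was installed above):

  # List the models installed on this server
  ollama list

  # Remove a model you no longer need
  ollama rm mistral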

Enable CORS for the Ollama server

By default, the Ollama server only accepts same-origin requests. Since GPT for Work always has a different origin from the Ollama server, you must enable cross-origin resource sharing (CORS) for the server using the OLLAMA_ORIGINS environment variable.

To enable CORS for the Ollama server:

  1. Set OLLAMA_ORIGINS with the origins that are allowed to access the server:

    # Set a single origin
    launchctl setenv OLLAMA_ORIGINS "<ORIGIN>"

    # Set multiple origins
    launchctl setenv OLLAMA_ORIGINS "<ORIGIN_1>,<ORIGIN_2>,..."

    For example:

    • Allow any origin to make requests to the server:

      launchctl setenv OLLAMA_ORIGINS "*"
    • Allow only GPT for Work to make requests to the server:

      launchctl setenv OLLAMA_ORIGINS "https://excel-addin.gptforwork.com"
  2. Restart the server for the variable to take effect.
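
    One way to restart the server, assuming you installed it with the desktop application, is to quit and reopen the Ollama app:

      # Quit the Ollama desktop application (this also stops the background server)
      osascript -e 'quit app "Ollama"'

      # Reopen the application; the server starts again and picks up the new variable
      open -a Ollama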

This completes the minimum required setup. You can now use http://localhost:11434 as the Ollama endpoint in GPT for Work.
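
To spot-check that CORS is enabled, you can send a request with an Origin header and confirm that the response includes an Access-Control-Allow-Origin header. The origin below is the GPT for Work Excel add-in used as an example earlier:

  # Send a request with an Origin header and print the response headers
  curl -i -H "Origin: https://excel-addin.gptforwork.com" http://localhost:11434

  # If the origin is allowed, the response headers should include a line similar to:
  # Access-Control-Allow-Origin: https://excel-addin.gptforwork.com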

Configure access to the Ollama server

By default, the Ollama server binds to IP address 127.0.0.1 and port 11434, which corresponds to localhost:11434 on most machines. This means that you can only access the server from the machine where the server is running.

If you want to access the server from other machines on the local network, or if you want to change the port, you must rebind the server using the OLLAMA_HOST environment variable.

To rebind the Ollama server:

  1. Set OLLAMA_HOST with an IP address and optionally port:

    # Set IP address
    launchctl setenv OLLAMA_HOST "<IP_ADDRESS>"

    # Set IP address and port
    launchctl setenv OLLAMA_HOST "<IP_ADDRESS>:<PORT_NUMBER>"

    For example:

    • Allow all machines on the local network to access the server on the default port:

      launchctl setenv OLLAMA_HOST "0.0.0.0"
    • Allow all machines on the local network to access the server on port 11535:

      launchctl setenv OLLAMA_HOST "0.0.0.0:11535"
    • Allow only localhost access on port 11535:

      launchctl setenv OLLAMA_HOST "127.0.0.1:11535"
  2. Restart the server for the variable to take effect.
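
    After restarting, you can check that the server is reachable from another machine on the local network. The sketch below assumes the default port 11434 and uses <SERVER_IP> as a placeholder for the Mac's local IP address:

      # On the Mac running Ollama: look up its local IP address (en0 is typically the built-in Wi-Fi or Ethernet interface)
      ipconfig getifaddr en0

      # On another machine on the same network: verify that the server responds
      curl http://<SERVER_IP>:11434

      # Expected response: Ollama is running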

Enable HTTPS for the Ollama server

The Ollama server uses HTTP to serve models, while GPT for Work runs on HTTPS. Any request from GPT for Work to the server is therefore a mixed-content request (an HTTP request made from an HTTPS page). Modern web browsers do not allow mixed content. The only exceptions are mixed-content requests to http://127.0.0.1 and http://localhost, which most browsers treat as safe and therefore allow. Safari, however, blocks mixed-content requests even to the local machine.

To avoid mixed content, you must make the Ollama server accessible over HTTPS in the following cases:

  • The server runs on the same machine as GPT for Work, and you use GPT for Work from:

    • Excel Online or Word Online on Safari

    • Excel for Mac (uses Safari)

    • Word for Mac (uses Safari)

  • The server runs on a different machine than GPT for Work.

You make the Ollama server accessible over HTTPS by setting up a reverse proxy that hides the server behind an HTTPS interface. This guide uses nginx to set up the reverse proxy.

tip

You can also use Cloudflare Tunnel, ngrok, or a similar cloud-based tunneling service to set up HTTPS with minimal configuration. Note that with a cloud-based tunnel, your traffic is routed through an external provider over the internet.
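
For example, with ngrok installed and connected to an ngrok account, a single command can expose the local server over HTTPS; the HTTPS URL that ngrok prints is what you would then use as the endpoint (the --host-header option rewrites the Host header so requests appear to come from localhost):

  # Expose the local Ollama server through an HTTPS tunnel
  ngrok http 11434 --host-header="localhost:11434"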

To set up the reverse proxy:

  1. Set up nginx:

    1. Install nginx:

      brew install nginx
    2. Verify that the nginx web server is running by opening http://localhost:8080 (Homebrew configures nginx to listen on port 8080 by default). You should see the nginx welcome page. If the page does not load, start nginx first by running brew services start nginx.

  2. Set up a self-signed SSL certificate for the Ollama server:

    1. Change to the nginx configuration directory (the path depends on whether your Mac has an Apple silicon or an Intel processor):

      # Mac with an Apple silicon processor
      cd /opt/homebrew/etc/nginx

      # Mac with an Intel processor
      cd /usr/local/etc/nginx
    2. Generate the certificate:

      openssl req \
      -x509 \
      -newkey rsa:2048 \
      -nodes \
      -sha256 \
      -days 365 \
      -keyout ollama.key \
      -out ollama.crt \
      -subj '/CN=localhost' \
      -extensions extensions \
      -config <(printf "[dn]\nCN=localhost\n[req]\ndistinguished_name=dn\n[extensions]\nsubjectAltName=DNS:localhost\nkeyUsage=digitalSignature\nextendedKeyUsage=serverAuth")

      The command generates two files in the current directory:

      • ollama.crt: Public self-signed certificate

      • ollama.key: Private key used to sign the certificate

    3. Add the certificate to trusted root certificates:

      sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain ./ollama.crt
  3. Create a site configuration for the Ollama server:

    1. Open the nginx.conf file in your preferred text editor. The file is in the nginx configuration directory:

      # Mac with an Apple silicon processor
      /opt/homebrew/etc/nginx/nginx.conf

      # Mac with an Intel processor
      /usr/local/etc/nginx/nginx.conf
    2. Replace the content of the file with the following configuration:

      events {}

      http {
          server {
              listen 11435 ssl;
              server_name localhost;

              http2 on;

              ssl_certificate ollama.crt;
              ssl_certificate_key ollama.key;

              location / {
                  proxy_pass http://localhost:11434;

                  # Keep connection to backend alive.
                  proxy_http_version 1.1;
                  proxy_set_header Connection '';
              }
          }
      }

      The configuration uses port 11435 for HTTPS. Requests to https://localhost:11435 are forwarded to the Ollama server running at http://localhost:11434.

      note

      If you bound the Ollama server to a custom IP address or port, adjust http://localhost:11434 accordingly in the configuration. If you bound the server to 0.0.0.0, leave the IP address as is.

    3. Save and close the file.

    4. Test the nginx configuration:

      sudo nginx -t

      Response if the configuration is valid:

      nginx: the configuration file /opt/homebrew/etc/nginx/nginx.conf syntax is ok
      nginx: configuration file /opt/homebrew/etc/nginx/nginx.conf test is successful
    5. Start nginx:

      sudo nginx
  4. Verify that the HTTPS connection works:

    curl -k https://localhost:11435

    Response if the HTTPS connection works:

    Ollama is running
    tip

    You can also open https://localhost:11435 in a web browser to verify that the HTTPS connection works.

    [Screenshot: Ollama running over HTTPS in the browser]

You're done with the setup. You can now use the Ollama server HTTPS URL as the Ollama endpoint in GPT for Work.