Set up Ollama on Windows

This guide walks you through the main steps of setting up an Ollama server for use in GPT for Work on Windows.

This guide assumes that you use GPT for Work on the same machine that hosts Ollama.

Prerequisites
  • Local machine with enough processing power and memory to run LLMs (see the Ollama documentation for recommendations)

  • Windows 10 or newer (see the Ollama documentation for detailed system requirements)

  • Windows user account with administrator privileges

  • Installed software: curl (used in this guide to verify the server; included by default in Windows 10 version 1803 and later)

To set up the Ollama server on Windows:

  1. Install the server.

  2. Install a model on the server.

  3. Enable CORS for the server.

Install the Ollama server

  1. Download the Windows installer from the Ollama website and run it. Follow the on-screen instructions to complete the installation.

    The installer starts the Ollama server in the background and sets the server to start automatically on system boot. The installer also installs the Ollama desktop application for easily starting and stopping the server.
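
    You can confirm that the Ollama command-line tool is installed and on your PATH by checking its version:

    ollama --version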

    tip

    You can manually start a new instance of the Ollama server in a terminal by running ollama serve. However, to avoid port conflicts, make sure no other Ollama server instances are running at the same time.
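
    For example, to run a foreground instance that logs directly to the terminal, quit the background instance from the system tray and run:

    ollama serve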

  2. Open PowerShell as an administrator and verify that the server is running:

    curl http://localhost:11434

    Response if the server is running:

    Ollama is running

    note

    In Windows PowerShell, curl is an alias for Invoke-WebRequest by default. Make sure you're running the actual curl executable, not the alias:

    Remove-Item alias:curl

    If the alias exists, this command removes it for the current terminal session, ensuring that the curl commands in the rest of this guide work as expected.
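
    If you're not sure whether the alias is active, check what curl resolves to:

    Get-Command curl

    If the CommandType column shows Alias, run the Remove-Item command above. If it shows Application (curl.exe), you're already using the real executable.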

You have installed the Ollama server.

For more information about installing and managing the server on Windows, see the Ollama documentation.

Install a model

  1. Install a model from the Ollama library:

    ollama pull <model-name>

    For example, to install the Llama 3.2 3B model:

    ollama pull llama3.2

    After the installation completes, the model is available for prompting on the Ollama server.
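
    You can also list the models installed on the server:

    ollama list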

  2. Verify that the model works:

    curl http://localhost:11434/api/generate -d '{ "model": "llama3.2", "prompt": "Who are you?", "stream": false }'

    The above request asks the server to generate a response for the specified prompt with the specified model and to return the response in a single reply. If the model works, the server returns a JSON object containing the response and metadata about the response.
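
    An abridged example of a successful response (the values shown here are illustrative, and the exact set of fields varies by Ollama version):

    {
      "model": "llama3.2",
      "created_at": "...",
      "response": "I'm an AI assistant based on the Llama model...",
      "done": true,
      ...
    }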

You have installed the model on the Ollama server. For more information about working with models, see the Ollama documentation.

Enable CORS for the Ollama server

By default, the Ollama server only accepts same-origin requests. Since GPT for Work always has a different origin from the Ollama server, you must enable cross-origin resource sharing (CORS) for the server using the OLLAMA_ORIGINS environment variable.

To enable CORS for the Ollama server:

  1. Set OLLAMA_ORIGINS with the origins that are allowed to access the server:

    # Set a single origin
    setx OLLAMA_ORIGINS "<ORIGIN>"

    # Set multiple origins
    setx OLLAMA_ORIGINS "<ORIGIN_1>, <ORIGIN_2>, ..."

    For example:

    • Allow any origin to make requests to the server:

      setx OLLAMA_ORIGINS "*"

    • Allow only GPT for Work to make requests to the server:

      setx OLLAMA_ORIGINS "https://excel-addin.gptforwork.com"

    note

    By default, the setx command sets a user variable. If you want to set a system variable, which applies to all users of the current machine, run the command (as an administrator) with the /m parameter. For example:

    setx OLLAMA_ORIGINS "*" /m
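
    Because setx only affects processes started after it runs, the variable is not visible in the current terminal session. To verify it, open a new PowerShell window and run:

    echo $env:OLLAMA_ORIGINS
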
  2. Restart the server so the new environment variable takes effect: quit Ollama from the system tray if it's running, then start it again from the Windows Start menu.

  3. Verify that the /api/tags endpoint of the server works:

    curl http://localhost:11434/api/tags

    GPT for Work uses the endpoint to fetch a list of models installed on the server. If the endpoint works, the server returns a JSON object with a models property listing all currently installed models:

    {
      "models": [
        {
          ...
        }
      ]
    }
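
    You can also simulate a cross-origin request from GPT for Work by sending an Origin header and inspecting the response headers (a quick check; the exact headers returned may vary by Ollama version):

    curl -i http://localhost:11434/api/tags -H "Origin: https://excel-addin.gptforwork.com"

    If the origin is allowed, the response includes an Access-Control-Allow-Origin header.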

You have enabled CORS for the Ollama server.

What's next

You have completed the setup required to access the Ollama server from GPT for Work on the same machine. You can now set http://localhost:11434 as the Ollama server URL in GPT for Work.