Set up Ollama on Windows
This guide walks you through the main steps of setting up an Ollama server for use in GPT for Work on Windows.
This guide assumes that you use GPT for Work on the same machine that hosts Ollama.
Prerequisites
Local machine with enough processing power and memory to run LLMs (see the Ollama documentation for recommendations)
Windows 10 or newer (see the Ollama documentation for detailed system requirements)
Windows user account with administrator privileges
Installed software:
curl (ships with Windows)
To set up the Ollama server on Windows:
Install the Ollama server
Download and run the Windows installer. Follow the on-screen instructions to complete the installation.
The installer starts the Ollama server in the background and sets the server to start automatically on system boot. The installer also installs the Ollama desktop application for easily starting and stopping the server.
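If you want to confirm that the installer actually started the server process, you can optionally check for it in PowerShell. This is a quick sketch assuming the default process name ollama; the command prints nothing if no matching process is found:
Get-Process ollama -ErrorAction SilentlyContinue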
Tip: You can manually start a new instance of the Ollama server in a terminal by running ollama serve. However, to avoid port conflicts, make sure no other Ollama server instances are running at the same time.
Open PowerShell as an administrator and verify that the server is running:
curl http://localhost:11434
Response if the server is running:
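In a default installation, the server replies with a short plain-text message (the exact wording may vary by version):
Ollama is running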
Note: Make sure you're using curl and not an aliased command, such as Invoke-WebRequest:
Remove-Item alias:curl
If curl has an alias, the command removes it for the current terminal session, ensuring that subsequent curl commands in this guide work as expected.
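If you're unsure whether curl is aliased in your session, you can check first and remove the alias only if it exists. A small PowerShell sketch using the standard Get-Alias and Test-Path cmdlets:
# Show the alias if one exists; prints nothing otherwise
Get-Alias curl -ErrorAction SilentlyContinue
# Remove the alias only if it is defined
if (Test-Path alias:curl) { Remove-Item alias:curl }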
You have installed the Ollama server.
For more information about installing and managing the server on Windows, see the Ollama documentation.
Install a model
Install a model from the Ollama library:
ollama pull <model-name>
For example, to install the Llama 3.2 3B model:
ollama pull llama3.2
After the installation completes, the model is available for prompting on the Ollama server.
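Optionally, you can confirm the installation by listing the models available on the server with the Ollama CLI:
ollama list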
Verify that the model works:
curl http://localhost:11434/api/generate -d '{ "model": "llama3.2", "prompt": "Who are you?", "stream": false }'
The above request asks the server to generate a response for the specified prompt with the specified model and to return the response in a single reply. If the model works, the server returns a JSON object containing the response and metadata about the response.
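The exact fields depend on the Ollama version; a trimmed, illustrative response looks roughly like this, with the generated text in the response property:
{
  "model": "llama3.2",
  "created_at": "...",
  "response": "...",
  "done": true,
  ...
}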
You have installed the model on the Ollama server. For more information about working with models, see the Ollama documentation.
Enable CORS for the Ollama server
By default, the Ollama server only accepts same-origin requests. Since GPT for Work always has a different origin from the Ollama server, you must enable cross-origin resource sharing (CORS) for the server using the OLLAMA_ORIGINS environment variable.
To enable CORS for the Ollama server:
Set OLLAMA_ORIGINS to the origins that are allowed to access the server:
# Set a single origin
setx OLLAMA_ORIGINS "<ORIGIN>"
# Set multiple origins
setx OLLAMA_ORIGINS "<ORIGIN_1>, <ORIGIN_2>, ..."
For example:
Allow any origin to make requests to the server:
setx OLLAMA_ORIGINS "*"
Allow only GPT for Work to make requests to the server:
setx OLLAMA_ORIGINS "https://excel-addin.gptforwork.com"
Note: By default, the setx command sets a user variable. If you want to set a system variable, which applies to all users of the current machine, run the command (as an administrator) with the /m parameter. For example:
setx OLLAMA_ORIGINS "*" /m
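Note that setx writes the value for future processes only; it does not change the environment of terminals or applications that are already open. If you want to confirm the value, an optional check is to open a new PowerShell window and print the variable:
echo $env:OLLAMA_ORIGINS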
Restart the Ollama server so that it picks up the new environment variable: quit Ollama if it's running, then start it again from the Windows Start menu.
Verify that the /api/tags endpoint of the server works:
curl http://localhost:11434/api/tags
GPT for Work uses the endpoint to fetch a list of models installed on the server. If the endpoint works, the server returns a JSON object with a models property listing all currently installed models:
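For reference, a trimmed, illustrative response; the entries match whatever models you have installed:
{
  "models": [
    {
      "name": "llama3.2:latest",
      "modified_at": "...",
      "size": ...,
      "digest": "..."
    }
  ]
}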
You have enabled CORS for the Ollama server.
You have completed the setup required to access the Ollama server from GPT for Work on the same machine. You can now set http://localhost:11434 as the Ollama server URL in GPT for Work.