Ollama is an open-source AI tool designed to let users run and deploy large language models (LLMs) locally. Its goal is to give developers a convenient and efficient way to use GPT-style models on their own machines without relying on cloud services. Ollama supports multiple models and focuses on performance optimization, allowing even resource-constrained devices to run these models smoothly.
Through Ollama, users can utilize text-based AI applications and interact with locally deployed models without worrying about data privacy or high API usage fees. You can invoke different models via a command-line interface (CLI) for tasks such as natural language processing and question answering.
Ollama is well suited for experimenting with various models. In my testing, however, the Windows build did not fully exploit the hardware, possibly due to limitations of the Windows version: when running a 32B-parameter model, memory and GPU utilization stayed low yet responses were slow.
Hardware Overview
- Operating System: Windows 11
- CPU: i7-10700K
- Memory: 40GB
- Graphics Card: RTX 3060 12GB
Environment Setup
Add the following system environment variables to simplify later use:
set OLLAMA_MODELS=E:\ollama
This variable specifies where Ollama stores its models. E:\ollama is a folder path: all local model files will be kept in this directory, and Ollama will load the language models you download or deploy from this path. You can store model files in another location by simply changing this path.
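Note that set only applies to the current command prompt session. To make the variable persistent you can use setx (or the Windows Environment Variables dialog); the line below is a minimal example using the same E:\ollama path, and it only takes effect in new terminals after Ollama is restarted:
setx OLLAMA_MODELS "E:\ollama"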
set OLLAMA_HOST=127.0.0.1:8000
This environment variable sets the host and port for the Ollama service. 127.0.0.1 is the localhost address, meaning the Ollama service will only listen for requests from the local machine. 8000 is the port number on which the Ollama service will wait for and process requests. You can change the port if needed, but make sure it is not already in use by another application.
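Once the service is running (for example via ollama serve), you can confirm it is listening on the configured address. The requests below are a small sketch using Ollama's standard REST API; the port matches the OLLAMA_HOST value above, and deepseek-r1:14b stands in for whichever model you have already pulled (it is deployed in the next section):
rem List the models available locally
curl http://127.0.0.1:8000/api/tags
rem Ask a model a one-off question (non-streaming)
curl http://127.0.0.1:8000/api/generate -d "{\"model\": \"deepseek-r1:14b\", \"prompt\": \"Why is the sky blue?\", \"stream\": false}"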
set OLLAMA_ORIGINS=*
This environment variable controls which origins are allowed to access the Ollama service. * means that all origins (i.e., all domains and IP addresses) may access it. This is typically used for development and debugging; in a production environment you usually want stricter origin control, allowing only specific domains or IPs to reach the service for better security.
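For example, OLLAMA_ORIGINS accepts a comma-separated list of allowed origins, so to restrict access to a single local web frontend (the origins below are purely illustrative) you could set:
set OLLAMA_ORIGINS=http://localhost:3000,http://127.0.0.1:3000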
DeepSeek-R1 Model Deployment
Ollama installation is straightforward, so it is not detailed here.
Post-installation verification:
C:\Users\core>ollama -v
ollama version is 0.5.11
To deploy the model, refer to the official model page and choose a version with a suitable parameter count:
ollama run deepseek-r1:14b
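If you only want to download the model without starting an interactive session, or want to check what is already installed, the standard Ollama CLI commands below also work (shown as a sketch; the prompt text is just an example):
rem Download the model without opening a chat session
ollama pull deepseek-r1:14b
rem List the models stored under OLLAMA_MODELS
ollama list
rem Ask a one-off question without entering the interactive prompt
ollama run deepseek-r1:14b "Summarize what a binary search does."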
The 14B version handles conversation context well; the smaller versions cannot retain context. The 32B version is very sluggish on this hardware and was not tested further.
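To see how a loaded model is split between GPU and CPU memory (useful when diagnosing the low GPU utilization mentioned above), ollama ps reports the size and processor placement of each running model; run it in a second terminal while a model is answering a prompt:
ollama ps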