Ollama is an open-source tool for running and deploying large language models (LLMs) locally. Its goal is to give developers a convenient, efficient way to run GPT-style models on their own machines without relying on cloud services. Ollama supports a variety of models and focuses on performance optimization, so that even resource-constrained devices can run them smoothly.
With Ollama, users can build text-based AI applications and interact with locally deployed models without worrying about data privacy or high API usage fees. Different models can be invoked through the command-line interface (CLI) to perform tasks such as natural language processing and question answering.
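For example, a single prompt can be sent to a locally pulled model straight from the CLI (the model tag below matches the one deployed later in this article; any pulled model works):

ollama run deepseek-r1:14b "Explain the difference between a process and a thread in one paragraph."

Without a prompt argument, the same command opens an interactive chat session with the model.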
Ollama is well suited for trying out different models. In my testing, the Windows version does not fully utilize the hardware; this may be a limitation of the Windows build, and the Linux version might perform better. When deploying a 32B-parameter model, memory and GPU load stayed low but responses were very slow.
Hardware Overview
- Operating system: Windows 11
- CPU: i7-10700K
- Memory: 40GB
- Graphics card: RTX 3060 12GB
Environment Preparation
Add the following system environment variables for convenient use later (a persistent way to set them is shown after this list):

- set OLLAMA_MODELS=E:\ollama
  This variable specifies the storage path for Ollama models. E:\ollama is a folder path; all local model files are kept in this directory, and Ollama loads your downloaded or deployed language models from it. You can store the model files elsewhere by simply changing this path.
- set OLLAMA_HOST=127.0.0.1:8000
  This variable sets the host and port for the Ollama service. 127.0.0.1 is the local address (localhost), meaning the service only listens for requests from the local machine. Port 8000 is the port on which the service listens for and processes requests; you can change it as needed, as long as it is not already occupied by another application.
- set OLLAMA_ORIGINS=*
  This variable controls which request origins are allowed to access the Ollama service. The * allows requests from any origin (all domains and IP addresses). This is typically used in development and debugging; in production you would usually apply stricter origin control and limit access to specific domains or IPs to improve security.
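Note that set only affects the current terminal session. To make these values persist as environment variables, so that the Ollama service and CLI pick them up in any new session, they can be written once with setx (shown here with the same example values; adjust the path and port to your setup):

setx OLLAMA_MODELS "E:\ollama"
setx OLLAMA_HOST "127.0.0.1:8000"
setx OLLAMA_ORIGINS "*"

setx writes user-level variables by default (append /M in an elevated prompt for machine-wide ones) and only takes effect in terminals opened afterwards, so open a new terminal and restart the Ollama service before testing.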
DeepSeek-R1 Model Deployment
Ollama installation is straightforward and will not be elaborated on here.
Post-installation verification
C:\Users\core>ollama -v
ollama version is 0.5.11
To deploy a model, refer to the model page on the official website and select the variant with the desired parameter count, for example: ollama run deepseek-r1:14b
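Because OLLAMA_HOST was set to 127.0.0.1:8000 above, the running service can also be called over HTTP instead of through the interactive CLI. A minimal sketch using the curl that ships with Windows 10/11 (quotes escaped for the command prompt; the prompt text is arbitrary):

curl http://127.0.0.1:8000/api/generate -d "{\"model\": \"deepseek-r1:14b\", \"prompt\": \"Why is the sky blue?\", \"stream\": false}"

With "stream": false the response comes back as a single JSON object; omit it to receive tokens as a stream.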
The 14B-parameter model can effectively remember conversation context, while smaller variants cannot. The 32B variant runs very slowly when deployed locally, so I did not test it further.
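Over the HTTP API, conversation context is carried explicitly by the caller: the /api/chat endpoint takes the full message history on every request, and how well the model makes use of that history is where the 14B and smaller variants differ. A rough sketch, again escaped for the Windows command prompt:

curl http://127.0.0.1:8000/api/chat -d "{\"model\": \"deepseek-r1:14b\", \"messages\": [{\"role\": \"user\", \"content\": \"My name is Alice.\"}, {\"role\": \"assistant\", \"content\": \"Nice to meet you, Alice.\"}, {\"role\": \"user\", \"content\": \"What is my name?\"}], \"stream\": false}"

In an interactive ollama run session the CLI keeps this history for you; over the API you append each new turn to the messages array yourself.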