Preface
I've deployed DeepSeek with Ollama on Linux servers before. This time I deployed it on a server with no external network access (even more restricted, you could say), hit a few pitfalls along the way, and am recording them here.
ollama
On an offline server, Ollama obviously can't be installed with the online install script. Following the Ollama documentation, first download the installation package on your local machine, matching the server's OS and CPU architecture (this example assumes an x86-64 Linux server):
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
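If you're not sure of the server's architecture, you can check it on the server itself with standard commands (nothing Ollama-specific):
uname -s   # OS kernel, e.g. Linux
uname -m   # CPU architecture, e.g. x86_64 or aarch64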
Then upload it to the server with scp or a similar tool (user, host, and destination are placeholders):
scp ollama-linux-amd64.tgz user@server:/tmp
After connecting to the server, extract the package following the Ollama docs (see the first reference):
sudo tar -C /usr -xzf ollama-linux-amd64.tgz
At this point the ollama binary can be run:
ollama serve
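To confirm the server is up, you can query the local API from another shell; by default Ollama listens on port 11434 (standard Ollama behavior, not specific to this setup):
curl http://localhost:11434
# should print: Ollama is running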
Next, register it as a systemd service, which is also Ollama's officially recommended practice and makes management easier. First create a dedicated user and group:
sudo useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama
sudo usermod -a -G ollama $(whoami)
Create a new file at /etc/systemd/system/ollama.service with the following content:
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=$PATH"

[Install]
WantedBy=default.target
Then reload systemd and enable the service:
sudo systemctl daemon-reload
sudo systemctl enable ollama
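Note that enable only registers the service to start at boot; to start it immediately and verify it is healthy (standard systemctl usage):
sudo systemctl start ollama
systemctl status ollama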
That completes the Ollama installation.
Model deployment
An offline server can't use ollama pull to fetch models. You need to download them locally first: run ollama pull on your local machine, then find the model files and upload them to the server. That's the general idea; the details follow.
Find the local model file
Unless configured otherwise, Ollama stores its model files in ~/.ollama/models/blobs.
First run the following command to find the path of a specific model; for example, to locate deepseek-r1:32b:
ollama show deepseek-r1:32b --modelfile
The command's output (excerpt):
FROM C:\Users\deali\.ollama\models\blobs\sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49
TEMPLATE """{{- if .System }}{{ .System }}{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1}}
{{- if eq .Role "user" }}<|User|>{{ .Content }}
{{- else if eq .Role "assistant" }}<|Assistant|>{{ .Content }}{{- if not $last }}<|end▁of▁sentence|>{{- end }}
{{- end }}
{{- if and $last (ne .Role "assistant") }}<|Assistant|>{{- end }}
{{- end }}"""
PARAMETER stop <|begin▁of▁sentence|>
PARAMETER stop <|end▁of▁sentence|>
PARAMETER stop <|User|>
PARAMETER stop <|Assistant|>
Note this line:
FROM C:\Users\deali\.ollama\models\blobs\sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49
It is the path of the model blob that Ollama downloaded locally.
Upload this file to the server
Export Modelfile
The Modelfile format is similar to a Dockerfile's. Export it using the following command:
ollama show deepseek-r1:32b --modelfile > Modelfile
Upload this file to the server as well.
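For example, from a Unix-like local machine (user, host, and target directory are placeholders; on Windows the blob lives under C:\Users\<user>\.ollama\models\blobs instead):
scp Modelfile user@server:/tmp/model/
scp ~/.ollama/models/blobs/sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49 user@server:/tmp/model/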
Importing models on the server
Once the model blob and the Modelfile have been uploaded, put them in the same directory. Rename the blob first to make the import easier:
mv sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49 deepseek-r1_32b.gguf
Then edit the Modelfile and change the FROM line to point at the file you just renamed:
FROM ./deepseek-r1_32b.gguf
Then run the following command to import the model:
ollama create deepseek-r1:32b -f Modelfile
If nothing goes wrong, the import will succeed, and you can run ollama list to check that the model has been imported.
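As a quick smoke test, you can also chat with the imported model directly:
ollama run deepseek-r1:32b "Hello"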
one-api
One API is an open-source LLM (large language model) API management and distribution system, designed to provide unified access to many large models through the standard OpenAI API format, ready to use out of the box. It supports a variety of mainstream models, including the OpenAI ChatGPT series, Anthropic Claude series, Google PaLM 2/Gemini series, Mistral series, ByteDance Doubao models, Baidu ERNIE Bot (Wenxin Yiyan) series, Alibaba Qwen (Tongyi Qianwen) series, iFlytek Spark, Zhipu ChatGLM series, Tencent Hunyuan, and more.
Docker deployment
one-api is written in Go using the Gin framework and is easy to deploy. I usually deploy it with Docker; the compose file I use is below.
services:
  db:
    image: mysql:8.1.0
    container_name: mysql
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: mysql-password
    volumes:
      - ./data:/var/lib/mysql
  one-api:
    image: justsong/one-api
    container_name: one-api
    restart: always
    ports:
      - "3000:3000"
    depends_on:
      - db
    environment:
      - SQL_DSN=root:mysql-password@tcp(db:3306)/one_api
      - TZ=Asia/Shanghai
      - TIKTOKEN_CACHE_DIR=/TIKTOKEN_CACHE_DIR
    volumes:
      - ./data:/data
      - ./TIKTOKEN_CACHE_DIR:/TIKTOKEN_CACHE_DIR

networks:
  default:
    name: one-api
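Bring everything up in the background and watch the one-api logs for startup errors:
docker compose up -d
docker compose logs -f one-api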
Solving tiktoken issues
The problem I ran into is that one-api depends on the tiktoken library, and tiktoken needs to download its token encoder files from the internet.
The solution starts with reading the error log, for example:
one-api | [FATAL] 2025/02/17 - 10:47:21 | relay/adaptor/openai/:26 [InitTokenEncoders] failed to get gpt-3.5-turbo token encoder: Get "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken": dial tcp 57.150.97.129:443: i/o timeout, if you are using in offline environment, please set TIKTOKEN_CACHE_DIR to use exsited files
So we need to download https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken. Download the file on a machine with internet access and upload it to the server. But that alone doesn't work: tiktoken looks up cached files by the SHA-1 hash of the download URL, not by filename.
Generate SHA-1
TIKTOKEN_URL=https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken
echo -n $TIKTOKEN_URL | sha1sum | head -c 40
Or as a single line:
echo -n "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken" | sha1sum | head -c 40
In this command, echo -n prints the URL string (the -n flag suppresses the trailing newline), sha1sum computes its SHA-1 hash, and head -c 40 keeps the first 40 characters, i.e. the 40 hex digits of the hash.
The execution result is
9b5ad71b2ce5302211f9c61530b329a4922fc6a4
Then rename the downloaded cl100k_base.tiktoken file to that hash value.
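Concretely, with the hash computed above:
mv cl100k_base.tiktoken 9b5ad71b2ce5302211f9c61530b329a4922fc6a4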
We already set the TIKTOKEN_CACHE_DIR environment variable in the compose file in the previous section, so put this 9b5ad71b2ce5302211f9c61530b329a4922fc6a4 file into the TIKTOKEN_CACHE_DIR directory.
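With the compose file above, the host directory ./TIKTOKEN_CACHE_DIR is mounted into the container at /TIKTOKEN_CACHE_DIR, so it's enough to drop the file there and restart the container:
mv 9b5ad71b2ce5302211f9c61530b329a4922fc6a4 ./TIKTOKEN_CACHE_DIR/
docker compose restart one-api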
If you run into similar errors for other encoders later, repeat the steps above until no more errors appear. In my setup, only two encoders needed to be downloaded.
References
- https://github.com/ollama/ollama/blob/main/docs/linux.md
- https://stackoverflow.com/questions/76106366/how-to-use-tiktoken-in-offline-mode-computer
- https://www.cnblogs.com/cjdty/p/18659438
- https://zhuanlan.zhihu.com/p/20485169539