Installing Ollama

curl -fsSL https://ollama.com/install.sh | sh

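After a successful installation the ollama service starts automatically. To control the service manually, use the following commands:

systemctl start   ollama.service # start the service
systemctl stop    ollama.service # stop the service
systemctl restart ollama.service # restart the service
systemctl enable  ollama.service # start on boot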

Using Ollama

ollama help

Usage:
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  stop        Stop a running model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Downloading and running a model

#ollama run llama3.2
#ollama run gemma2
ollama run qwen2.5

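>>> 你是谁
我是Qwen，一个由阿里云开发的语言模型助手。我被设计用来提供信息、回答问题和
协助用户完成各种任务。有什么我可以帮助你的吗？
(English: the prompt asks "Who are you?"; the model replies: "I am Qwen, a language model assistant developed by Alibaba Cloud. I am designed to provide information, answer questions, and help users with a variety of tasks. Is there anything I can help you with?")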

>>> /bye

ollama list

NAME               ID              SIZE      MODIFIED    
qwen2.5:latest     845dbda0ea48    4.7 GB    1 days ago     
llama3.2:latest    a80c4f17acd5    2.0 GB    1 days ago
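
The same listing can also be read programmatically. The following is a minimal sketch, assuming the service is reachable at http://127.0.0.1:11434 and using Ollama's REST endpoint GET /api/tags. The helper names human_size, format_models and list_models are invented for this illustration and are not part of Ollama.

```python
import json
import urllib.request

OLLAMA_URL = 'http://127.0.0.1:11434'  # assumed default local address


def human_size(num_bytes):
    """Format a byte count with decimal units, e.g. 4700000000 -> '4.7 GB'."""
    size = float(num_bytes)
    for unit in ('B', 'KB', 'MB', 'GB'):
        if size < 1000:
            return f'{size:.1f} {unit}'
        size /= 1000
    return f'{size:.1f} TB'


def format_models(payload):
    """Turn the decoded /api/tags JSON into (name, short id, size) rows."""
    rows = []
    for m in payload.get('models', []):
        # The digest may carry a "sha256:" prefix; keep 12 hex characters.
        digest = m.get('digest', '').split(':')[-1][:12]
        rows.append((m['name'], digest, human_size(m['size'])))
    return rows


def list_models(base_url=OLLAMA_URL):
    """Fetch the locally installed models from a running Ollama service."""
    with urllib.request.urlopen(f'{base_url}/api/tags', timeout=10) as resp:
        return format_models(json.load(resp))


# Usage (requires a running service):
# for name, digest, size in list_models():
#     print(f'{name:<20}{digest:<16}{size}')
```

Keeping the formatting in format_models, separate from the network call, makes it possible to check the output layout without a running service.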

Model storage location

/usr/share/ollama/.ollama/

For other models, see: https://ollama.com/search

qwen2.5
qwen2.5-coder
llama3.2
llama3.2-vision
gemma2

Python interface

Once started, the ollama service listens on local port 11434 and accepts API calls there.
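
To illustrate what an API call on that port looks like without any extra library, here is a minimal sketch using only the Python standard library and Ollama's REST endpoint POST /api/generate. It assumes the default address http://127.0.0.1:11434 and that the qwen2.5 model has already been pulled. The function names build_generate_request and generate are invented for this example.

```python
import json
import urllib.request


def build_generate_request(prompt, model='qwen2.5',
                           base_url='http://127.0.0.1:11434'):
    """Build a non-streaming POST request for Ollama's /api/generate."""
    body = json.dumps({'model': model, 'prompt': prompt, 'stream': False})
    return urllib.request.Request(
        f'{base_url}/api/generate',
        data=body.encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text."""
    req = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(req, timeout=120) as resp:
        # With stream=False the reply is one JSON object whose
        # "response" field holds the generated text.
        return json.load(resp)['response']


# Usage (requires a running service and a pulled model):
# print(generate('Who are you?'))
```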

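To allow other devices to access the service, add the environment variable below to the [Service] section of the unit file, then run systemctl daemon-reload and restart the service.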

/etc/systemd/system/ollama.service

Environment="OLLAMA_HOST=0.0.0.0:11434"
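
After this change, reachability can be checked from another device. The sketch below assumes the service answers a plain GET on its root path with HTTP status 200. The address 192.168.1.10 is a placeholder for the server's real IP, and build_url and is_ollama_up are hypothetical helper names.

```python
import urllib.error
import urllib.request


def build_url(server_ip, port=11434):
    """Compose the base URL of a remote Ollama service."""
    return f'http://{server_ip}:{port}'


def is_ollama_up(base_url, timeout=3):
    """Return True if the service answers on its root path."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure or an HTTP error status.
        return False


# Usage from another machine; 192.168.1.10 is a placeholder address:
# print(is_ollama_up(build_url('192.168.1.10')))
```

If the check fails, a firewall rule blocking port 11434 is a common cause worth ruling out.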

Install the ollama library

pip install ollama

Python test code

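import ollama

def chat_ollama(question, model='qwen2.5'):
	"""Send one question to the local Ollama service and return the full reply."""
	text = ''
	stream = True  # set to False to receive the whole reply at once
	ollama_host = 'http://127.0.0.1:11434'
	client = ollama.Client(host=ollama_host)

	response = client.chat(model=model, stream=stream, messages=[
		{'role': 'user', 'content': question},
	])

	if stream:
		# With stream=True the response is an iterator of partial chunks.
		for chunk in response:
			content = chunk['message']['content']
			text += content
			print(content, end='', flush=True)
	else:
		# With stream=False the response is a single complete message.
		content = response['message']['content']
		text += content
		print(content)

	print('\n')
	return text

if __name__ == '__main__':
	chat_ollama('你是谁')  # "Who are you?"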

Output

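我是Qwen，一个由阿里云开发的超大规模语言模型。我被设计用来回答问题、提供信息、参与对话以及帮助用户解决各种问题。如果你有任何疑问或需要帮助，都可以尝试和我说话哦！
(English: "I am Qwen, a very large language model developed by Alibaba Cloud. I am designed to answer questions, provide information, take part in conversations, and help users solve all kinds of problems. If you have any questions or need help, feel free to talk to me!")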
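
The test code above sends a single question, so each call starts a fresh conversation. One way to support follow-up questions is to keep the message list between calls. The following is a minimal sketch of that idea. ChatSession is a hypothetical helper class, not part of the ollama library, and it assumes the client object exposes chat(model=..., messages=...) as used above.

```python
class ChatSession:
    """Keep the running message list so each call sees earlier turns."""

    def __init__(self, client, model='qwen2.5'):
        # `client` is any object with a chat(model=..., messages=...) method,
        # for example ollama.Client(host='http://127.0.0.1:11434').
        self.client = client
        self.model = model
        self.messages = []

    def ask(self, question):
        """Send one user turn and record the assistant's reply."""
        self.messages.append({'role': 'user', 'content': question})
        response = self.client.chat(model=self.model, messages=self.messages)
        answer = response['message']['content']
        self.messages.append({'role': 'assistant', 'content': answer})
        return answer


# Usage (requires `pip install ollama` and a running service):
# import ollama
# session = ChatSession(ollama.Client(host='http://127.0.0.1:11434'))
# print(session.ask('Who are you?'))
# print(session.ask('Repeat that in one sentence.'))
```

Passing the client in as an argument keeps the class independent of the network, so it can be exercised with a stand-in object.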

Building a web UI with Gradio

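Gradio is an open-source Python library for quickly building an interactive web UI for large language models, with no need to know web technologies such as HTML, CSS, or JavaScript.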

Install the gradio library

pip install gradio

Building on the previous example, add a UI created with gradio:

import ollama
import gradio as gr

def chat_ollama(question, model='qwen2.5'):
	"""Generator that yields the reply accumulated so far, for live UI updates."""
	text = ''
	stream = True  # set to False to receive the whole reply at once
	ollama_host = 'http://127.0.0.1:11434'
	client = ollama.Client(host=ollama_host)

	response = client.chat(model=model, stream=stream, messages=[
		{'role': 'user', 'content': question},
	])

	if stream:
		for chunk in response:
			content = chunk['message']['content']
			text += content
			print(content, end='', flush=True)  # echo to the console
			yield text  # the UI replaces the displayed reply with this text
	else:
		content = response['message']['content']
		text += content
		print(content)
		yield text

	print('\n')

def chat_response(message, history):
	# history is not used here, so every message is answered independently.
	yield from chat_ollama(message)

def webui():
	demo = gr.ChatInterface(fn=chat_response, type='messages', examples=['你好', '你是谁'])
	# Listen on all network interfaces so other devices can open the page.
	demo.launch(server_name='0.0.0.0')

if __name__ == '__main__':
	webui()

* Running on local URL:  http://0.0.0.0:7860

The result is shown in the figure below.
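
In the example above, chat_response receives the conversation history from Gradio but does not forward it to the model. The following is a minimal sketch of how the history could be converted, assuming that with type='messages' Gradio passes history as a list of dictionaries with role and content keys. build_messages is a hypothetical helper, not a Gradio or Ollama function.

```python
def build_messages(message, history):
    """Combine Gradio chat history with the new user message."""
    messages = [
        {'role': turn['role'], 'content': turn['content']}
        for turn in history
        # Skip non-text entries (e.g. file uploads) in this simple sketch.
        if isinstance(turn.get('content'), str)
    ]
    messages.append({'role': 'user', 'content': message})
    return messages


# Usage inside a chat function (client as in the example above):
# response = client.chat(model='qwen2.5', stream=True,
#                        messages=build_messages(message, history))
```

Only the role and content keys are copied, so any extra fields Gradio may attach to a turn are not sent to the model.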
