Compare commits
10 Commits
c8a1d8968d
...
765201f9ef
| Author | SHA1 | Date |
|---|---|---|
| | 765201f9ef | |
| | eb82b1a174 | |
| | f1cdf6dd1f | |
| | 05ad8e647c | |
| | 2c6639fa43 | |
| | 65b43da03b | |
| | 434172755d | |
| | 10a6a7051a | |
| | 5db04c3c05 | |
| | 61d8ad319b | |
17
.env
@ -1,4 +1,19 @@
|
||||
# Qwen/Qwen3.5-4B
|
||||
# deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
|
||||
SILICONFLOW_API_KEY = "your_api_key_here"
|
||||
SILICONFLOW_BASE_URL = "https://api.siliconflow.cn/v1"
|
||||
SILICONFLOW_BASE_URL = "https://api.siliconflow.cn/v1"
|
||||
|
||||
# gemma4:e2b
|
||||
OLLAMA_API_KEY = "ollama"
|
||||
OLLAMA_BASE_URL = "http://localhost:11434/v1"
|
||||
|
||||
# MiniMax-M2.7
|
||||
MINIMAX_API_KEY = "your_api_key_here"
|
||||
MINIMAX_BASE_URL = "https://api.minimaxi.com/v1"
|
||||
|
||||
BAILIAN_API_KEY = "your_api_key_here"
|
||||
BAILIAN_BASE_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1"
|
||||
|
||||
# gemma4:e2b
|
||||
CHERRY_API_KEY = "your_api_key_here"
|
||||
CHERRY_BASE_URL = "https://open.cherryin.cc/v1"
|
||||
|
||||
24
.gitignore
vendored
Normal file
@ -0,0 +1,24 @@
|
||||
# Python
|
||||
.venv/
|
||||
__pycache__/
|
||||
*.pyc
|
||||
*.pyo
|
||||
*.pyd
|
||||
.Python
|
||||
*.egg-info/
|
||||
dist/
|
||||
build/
|
||||
|
||||
# Env
|
||||
.env
|
||||
|
||||
# IDE
|
||||
.idea/
|
||||
.vscode/
|
||||
*.swp
|
||||
*.swo
|
||||
|
||||
# OS
|
||||
.DS_Store
|
||||
Thumbs.db
|
||||
.env
|
||||
@ -1 +1 @@
|
||||
3.11
|
||||
3.13
|
||||
|
||||
157
README.md
@ -1,23 +1,32 @@
|
||||
# LangChain Learning
|
||||
|
||||
|
||||
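
> A LangChain learning project that integrates the SiliconFlow API
> A learning project based on LangChain 0.3.27 that integrates the SiliconFlow and Ollama APIs

## Features

- **Multi-LLM integration**: supports the OpenAI API, SiliconFlow, and the LangChain abstraction layer
- **Multi-LLM integration**: supports the OpenAI API, SiliconFlow, Ollama, and the LangChain abstraction layer
- **MCP protocol support**: connects to multiple MCP servers through MultiServerMCPClient
- **Agents**: a LangGraph-based ReAct agent for autonomous reasoning and tool calling
- **RAG (retrieval-augmented generation)**: document retrieval and question answering backed by a FAISS vector store
- **Streaming responses**: real-time streamed output for a better user experience
- **Hands-on examples**: usage patterns from basic to advanced
- **Prompt engineering**: several ways to build prompt templates
- **Output parsing**: parsers for JSON and other formats
- **Tool calling**: define tools with the @tool decorator or StructuredTool
- **Token usage tracking**: monitor how many tokens API calls consume
- **Memory management**: persistent conversation history (ConversationBufferMemory, SummaryMemory)
- **Rich terminal UI**: Markdown rendering, multi-line input, and other advanced interactions
- **Model benchmarking tool**: measures a model's time to first token (TTFT) and tokens per second (TPS)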
|
||||
|
||||
## Quick Start

### 1. Install dependencies
|
||||
|
||||
```bash
|
||||
pip install langchain>=1.2.15 langchain-community>=0.4.1 langchain-siliconflow>=1.0.0 requests>=2.33.1
|
||||
uv sync
|
||||
```
|
||||
|
||||
### 2. Configure environment variables
|
||||
@ -25,31 +34,89 @@ pip install langchain>=1.2.15 langchain-community>=0.4.1 langchain-siliconflow>=
|
||||
Create a `.env` file in the project root:
|
||||
|
||||
```env
|
||||
# SiliconFlow
|
||||
SILICONFLOW_API_KEY=your_api_key_here
|
||||
SILICONFLOW_BASE_URL=https://api.siliconflow.cn/v1
|
||||
|
||||
# Ollama / local models
|
||||
OLLAMA_BASE_URL=http://localhost:11434/v1
|
||||
OLLAMA_API_KEY=ollama
|
||||
```
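
The example scripts in this repository copy these values into `OPENAI_API_KEY` and `OPENAI_BASE_URL` with `os.getenv`, which returns `None` when a variable is missing; assigning `None` into `os.environ` then fails with a `TypeError` that does not name the variable. The following is a minimal sketch of a stricter loader. `require_env` and `configure_openai_compat` are hypothetical helper names used for illustration and are not part of the repository.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


def configure_openai_compat(prefix: str) -> None:
    """Map <PREFIX>_API_KEY / <PREFIX>_BASE_URL onto the OpenAI-compatible variables.

    The prefix convention (SILICONFLOW, OLLAMA, ...) mirrors the .env file above.
    """
    os.environ["OPENAI_API_KEY"] = require_env(f"{prefix}_API_KEY")
    os.environ["OPENAI_BASE_URL"] = require_env(f"{prefix}_BASE_URL")
```

A script would call `configure_openai_compat("OLLAMA")` after `dotenv.load_dotenv()` and before constructing the chat model.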
|
||||
|
||||
### 3. Run the examples

**Direct API call (requests)**
```bash
python helloworld/helloworld.py
```
**Hello World examples**

**LangChain + ChatOpenAI interface**
```bash
python helloworld/helloworld_langchain_openai.py
```
| Example | Command | Description |
|------|------|------|
| Direct API call | `python helloworld/helloworld.py` | Calls the SiliconFlow API directly with requests |
| LangChain + ChatOpenAI | `python helloworld/helloworld_langchain_openai.py` | Calls the LLM through the OpenAI interface |
| LangChain + ChatSiliconFlow | `python helloworld/helloworld_siliconflow.py` | Uses the LangChain SiliconFlow integration |
| OpenAI client + SiliconFlow | `python helloworld/openai_siliconflow.py` | Uses the OpenAI client against SiliconFlow's compatible endpoint |

**LangChain + ChatSiliconFlow**
```bash
python helloworld/helloworld_siliconflow.py
```
**Prompt examples**

**OpenAI client + SiliconFlow**
```bash
python helloworld/openai_siliconflow.py
```
| Example | Command | Description |
|------|------|------|
| PromptTemplate | `python prompt/prompt_demo.py` | Builds prompts with PromptTemplate |
| Few-shot learning | `python prompt/fewshot_demo.py` | Few-shot prompting with worked examples |
| Load a prompt from a file | `python prompt/promt_from_file.py` | Loads a prompt template from a YAML file |
|
||||
|
||||
**Output parsing examples**

| Example | Command | Description |
|------|------|------|
| JSON parser | `python parser/json_parser_demo.py` | Parses LLM output with JsonOutputParser |

**RAG examples**

| Example | Command | Description |
|------|------|------|
| Basic RAG | `python rag/rag_demo.py` | Retrieval question answering backed by a FAISS vector store |

**Tool calling examples**

| Example | Command | Description |
|------|------|------|
| Tool definition | `python tools/tool_definition.py` | Defines tools with the @tool decorator and StructuredTool |
| Tool calling | `python tools/tool_demo.py` | Shows how the model calls a tool and receives the result |

**MCP examples**

| Example | Command | Description |
|------|------|------|
| MCP client | `python mcp/mcp_client.py` | Connects to an MCP server and calls a tool directly |
| Agent + MCP | `python mcp/mcp_client_with_agent.py` | Uses a ReAct agent to call MCP tools |
| Agent + MCP (simple) | `python mcp/mcp_client_with_agent_simple.py` | A simplified agent-based MCP call |

**MCP server examples**

| Example | Command | Description |
|------|------|------|
| Weather service | `python mcp/get_weather_server.py` | MCP server that answers weather queries |
| Math service | `python mcp/math_server.py` | MCP server that provides arithmetic tools |

**Token usage examples**

| Example | Command | Description |
|------|------|------|
| Token tracking | `python token/token_demo.py` | Tracks token consumption with get_openai_callback |

**Memory management examples**

| Example | Command | Description |
|------|------|------|
| Basic memory | `python memory/memory_desc.py` | Demonstrates the different Memory object types |
| Chat with memory | `python memory/memory_demo.py` | Multi-turn conversation using RunnableWithMessageHistory |
| Chat without memory | `python memory/without_memory_demo.py` | Basic LLM chat with no conversation history |
| Rich UI chat | `python memory/without_memory_demo_rich.py` | Memory-less chat with a Rich-styled terminal UI |

**Ollama examples**

| Example | Command | Description |
|------|------|------|
| Rich streaming chat | `python ollama/ollama_rich_chat.py` | Streaming chat with Markdown rendering and multi-line input |
| Model benchmarking tool | `python ollama/tps_monitor.py` | Measures a model's TTFT and TPS |
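
To make the two benchmark metrics concrete, here is a minimal sketch of how TTFT and TPS could be derived from any stream of text chunks. It is not taken from `ollama/tps_monitor.py`; the function name `measure_stream`, the injectable `clock` parameter, and the choice to compute throughput only over chunks that arrive after the first one are all assumptions made for illustration.

```python
import time
from typing import Callable, Iterable, Optional


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume a stream of text chunks and report simple latency metrics."""
    start = clock()
    first_at: Optional[float] = None
    count = 0
    for _ in chunks:
        now = clock()
        if first_at is None:
            first_at = now  # arrival time of the first chunk
        count += 1
    end = clock()

    if first_at is None:  # the stream produced nothing
        return {"ttft": None, "tps": 0.0, "total": end - start, "chunks": 0}

    generation_time = end - first_at
    # Assumption: throughput counts only the chunks that arrive after the first one.
    tps = (count - 1) / generation_time if generation_time > 0 else 0.0
    return {"ttft": first_at - start, "tps": tps, "total": end - start, "chunks": count}
```

With a real client, `chunks` would be the pieces of text yielded by a streaming chat call; passing a fake `clock` makes the function easy to check deterministically.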
|
||||
|
||||
## Project Structure
|
||||
|
||||
@ -60,6 +127,36 @@ langchain-learning/
|
||||
│   ├── helloworld_langchain_openai.py  # LangChain + ChatOpenAI
│   ├── helloworld_siliconflow.py  # LangChain + ChatSiliconFlow
│   └── openai_siliconflow.py  # OpenAI client + SiliconFlow
├── prompt/
│   ├── prompt_demo.py  # PromptTemplate example
│   ├── fewshot_demo.py  # Few-shot learning example
│   ├── promt_from_file.py  # Load a prompt from a file
│   ├── prompt_from_file.yaml  # Prompt template in YAML
│   └── prompt_from_file.json  # Prompt template in JSON
├── parser/
│   └── json_parser_demo.py  # JSON output parsing example
├── rag/
│   └── rag_demo.py  # RAG (retrieval-augmented generation) example
├── tools/
│   ├── tool_definition.py  # Ways to define tools
│   └── tool_demo.py  # End-to-end tool calling flow
├── mcp/
│   ├── mcp_client.py  # Basic MCP client usage
│   ├── mcp_client_with_agent.py  # Agent + MCP tool calling
│   ├── mcp_client_with_agent_simple.py  # Agent + MCP, simplified
│   ├── get_weather_server.py  # Weather MCP server
│   └── math_server.py  # Math MCP server
├── token/
│   └── token_demo.py  # Token usage tracking example
├── memory/
│   ├── memory_desc.py  # Memory object types
│   ├── memory_demo.py  # Conversation chain with memory
│   ├── with_memory_demo.py  # Chat with manually managed memory
│   ├── without_memory_demo.py  # Basic chat without memory
│   └── without_memory_demo_rich.py  # Memory-less chat with a Rich UI
├── ollama/
│   ├── ollama_rich_chat.py  # Ollama streaming chat (Rich UI)
│   └── tps_monitor.py  # Model benchmarking tool
├── main.py  # Entry point
├── pyproject.toml  # Project configuration
└── README.md
|
||||
@ -67,15 +164,27 @@ langchain-learning/
|
||||
|
||||
## Available Models
|
||||
|
||||
**SiliconFlow**
|
||||
- `deepseek-ai/DeepSeek-R1-0528-Qwen3-8B`
|
||||
- `Qwen/Qwen3.5-4B`
|
||||
- `Qwen/Qwen3-8B`
|
||||
|
||||
**Ollama (local)**
|
||||
- `gemma4:26b`
|
||||
- `gemma4:e2b`
|
||||
- `deepseek-v3.1:671b-cloud`
|
||||
|
||||
## Tech Stack

| Category | Technology |
|------|------|
| Framework | LangChain |
| LLM providers | SiliconFlow |
| Framework | LangChain 0.3.27 |
| Agent | LangGraph |
| LLM providers | SiliconFlow, Ollama |
| MCP | langchain-mcp-adapters |
| Vector store | FAISS |
| Terminal styling | Rich |
| Data validation | Pydantic |
| Language | Python 3.11+ |

## License
|
||||
|
||||
20
mcp/get_weather_server.py
Normal file
@ -0,0 +1,20 @@
|
||||
# uv add fastmcp
import logging

from fastmcp import FastMCP

# The root logger defaults to WARNING, so enable INFO for the messages below.
logging.basicConfig(level=logging.INFO)

mcp = FastMCP("mcp demo")


@mcp.tool()
async def get_weather(city: str) -> str:
    """Return the weather for the given city."""
    logging.info(f"Weather lookup called with city={city}")
    return f"The weather in {city} is great: sunny with clear skies"


if __name__ == '__main__':
    logging.info("Starting a weather service that can be called over MCP")
    mcp.run(transport="streamable-http", host="127.0.0.1", port=9000)
    # mcp.run(transport="stdio")  # stdio alternative; host and port do not apply
|
||||
29
mcp/math_server.py
Normal file
@ -0,0 +1,29 @@
|
||||
from fastmcp import FastMCP
import logging

# Configure logging
logging.basicConfig(
    level=logging.INFO,  # log at INFO and above
    format="%(asctime)s - %(levelname)s - %(message)s"  # log format
)
logger = logging.getLogger(__name__)

# Create the FastMCP instance
mcp = FastMCP("Math")


@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    logger.info("The add method is called: a=%d, b=%d", a, b)  # log each add call
    return a + b


@mcp.tool()
def multiply(a: int, b: int) -> int:
    """Multiply two numbers"""
    logger.info("The multiply method is called: a=%d, b=%d", a, b)  # log each multiply call
    return a * b


if __name__ == "__main__":
    logger.info("Start math server through MCP")  # log server startup
    # mcp.run(transport="streamable-http", port=8081, path='/mcp')  # alternative: serve over streamable HTTP
    mcp.run(transport="stdio")  # communicate over stdin/stdout (launched as a subprocess)
|
||||
12
mcp/mcp_client.py
Normal file
@ -0,0 +1,12 @@
|
||||
import asyncio

from fastmcp import Client

client = Client("http://localhost:9000/mcp")


async def call_tool(city: str):
    async with client:
        result = await client.call_tool("get_weather", {"city": city})
        print(result)


# Start the server first, then run this client to connect and call the tool.
# This calls the tool directly through the fastmcp client.
asyncio.run(call_tool("Beijing"))
|
||||
102
mcp/mcp_client_with_agent.py
Normal file
@ -0,0 +1,102 @@
|
||||
import asyncio
import logging

# from langchain.agents import create_react_agent
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_mcp_adapters.client import MultiServerMCPClient
import os
import dotenv

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)

dotenv.load_dotenv()

# Set environment variables (route the OpenAI-compatible client to Ollama)
os.environ['OPENAI_API_KEY'] = os.getenv("OLLAMA_API_KEY")
os.environ['OPENAI_BASE_URL'] = os.getenv("OLLAMA_BASE_URL")

# Default 'model_name': 'deepseek-ai/DeepSeek-V3.1'
llm = ChatOpenAI(model="gemma4:26b")


def print_optimized_result(agent_response):
    """
    Parse the agent response and print a condensed summary.

    :param agent_response: the full response returned by the agent
    """
    messages = agent_response.get("messages", [])
    steps = []  # tool calls and tool results, in order
    final_answer = None  # the final answer

    for message in messages:
        if hasattr(message, "additional_kwargs") and "tool_calls" in message.additional_kwargs:
            # Extract the tool call details
            tool_calls = message.additional_kwargs["tool_calls"]
            for tool_call in tool_calls:
                tool_name = tool_call["function"]["name"]
                tool_args = tool_call["function"]["arguments"]
                steps.append(f"Tool call: {tool_name}({tool_args})")
        elif message.type == "tool":
            # Extract the tool result
            tool_name = message.name
            tool_result = message.content
            steps.append(f"{tool_name} returned: {tool_result}")
        elif message.type == "ai":
            # The last AI message without tool calls is the final answer
            final_answer = message.content

    # Print the condensed result
    print("\nSteps:")
    for step in steps:
        print(f"- {step}")
    if final_answer:
        print(f"\nFinal answer: {final_answer}")


async def execute():
    # 1. Create the LangChain MCP client (uv add langchain_mcp_adapters)
    client = MultiServerMCPClient(
        # The weather server is started with:
        # mcp.run(transport="streamable-http", host="127.0.0.1", port=9000)
        {
            # Server definitions; more than one server can be listed
            "weather": {
                "url": "http://localhost:9000/mcp",
                "transport": "streamable_http",
            },
            "math": {
                "command": "python",  # could also be npx or uvx
                "args": ["./mcp/math_server.py"],
                "transport": "stdio"
            }
        }
    )

    try:
        # 2. Fetch the tool list through the client.
        # The tools come from the servers, so this can fail
        # (for example, a server is not running or the network is down).
        tools = await client.get_tools()

        # 3. Create an agent that loops: think -> act -> observe -> think -> act -> ... -> final answer
        agent = create_react_agent(model=llm, tools=tools)
        while True:
            user_input = input("Enter your question (type exit to quit) > ")
            if user_input == "exit":
                print("Thanks for using this demo, goodbye 👋🏻")
                break

            agent_response = await agent.ainvoke({"messages": user_input})
            # The agent calls the tools itself; no manual invocation is needed
            print_optimized_result(agent_response)

    finally:
        # Release resources
        if hasattr(client, 'close'):
            client.close()


if __name__ == '__main__':
    asyncio.run(execute())
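
The message-walking logic in `print_optimized_result` can be exercised without a model or an MCP server. The sketch below reproduces the same branching on plain objects; `FakeMessage` is a hypothetical stand-in that only mimics the attributes the function reads (`type`, `content`, `name`, `additional_kwargs`), and `summarize_trace` is an illustrative name, not part of the repository or of LangChain.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class FakeMessage:
    """Hypothetical stand-in exposing only the attributes the summary needs."""
    type: str
    content: str = ""
    name: Optional[str] = None
    additional_kwargs: Dict[str, Any] = field(default_factory=dict)


def summarize_trace(messages: List[FakeMessage]) -> Tuple[List[str], Optional[str]]:
    """Return (steps, final_answer) using the same branching as the demo."""
    steps: List[str] = []
    final_answer: Optional[str] = None
    for message in messages:
        tool_calls = message.additional_kwargs.get("tool_calls")
        if tool_calls:
            # An AI message that requests one or more tool calls
            for call in tool_calls:
                fn = call["function"]
                steps.append(f"call {fn['name']}({fn['arguments']})")
        elif message.type == "tool":
            # The result a tool sent back to the agent
            steps.append(f"{message.name} returned {message.content}")
        elif message.type == "ai":
            # An AI message without tool calls; the last one wins
            final_answer = message.content
    return steps, final_answer
```

Returning the steps instead of printing them keeps the parsing testable; a caller can still print them in whatever format it prefers.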
|
||||
56
mcp/mcp_client_with_agent_simple.py
Normal file
@ -0,0 +1,56 @@
|
||||
import asyncio
import logging

# from langchain.agents import create_react_agent
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_mcp_adapters.client import MultiServerMCPClient
import os
import dotenv

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)

dotenv.load_dotenv()

# Set environment variables (route the OpenAI-compatible client to Ollama)
os.environ['OPENAI_API_KEY'] = os.getenv("OLLAMA_API_KEY")
os.environ['OPENAI_BASE_URL'] = os.getenv("OLLAMA_BASE_URL")

# Default 'model_name': 'deepseek-ai/DeepSeek-V3.1'
llm = ChatOpenAI(model="gemma4:26b")


async def execute():
    # 1. Create the LangChain MCP client
    # uv add langchain_mcp_adapters
    client = MultiServerMCPClient(
        {
            # Server definitions; more than one server can be listed
            "weather": {
                "url": "http://localhost:9000/mcp",
                "transport": "streamable_http",
            }
        }
    )

    try:
        # 2. Fetch the tool list through the client.
        # The tools come from the server, so this can fail
        # (for example, the server is not running or the network is down).
        tools = await client.get_tools()

        # 3. Create an agent that loops: think -> act -> observe -> think -> act -> ... -> final answer
        agent = create_react_agent(model=llm, tools=tools)
        agent_response = await agent.ainvoke({"messages": "What is the weather like in Beijing? Is it a good day to go out?"})
        print(agent_response)

    finally:
        # Release resources
        if hasattr(client, 'close'):
            client.close()


if __name__ == '__main__':
    asyncio.run(execute())
|
||||
59
memory/memory_demo.py
Normal file
@ -0,0 +1,59 @@
|
||||
import os
import dotenv
import logging
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory


logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)

dotenv.load_dotenv()

# 1. Set environment variables
os.environ['OPENAI_API_KEY'] = os.getenv("SILICONFLOW_API_KEY")
os.environ['OPENAI_BASE_URL'] = os.getenv("SILICONFLOW_BASE_URL")

# 2. Initialize the model
llm = ChatOpenAI(model="deepseek-ai/DeepSeek-V3.1")

# 3. Define the prompt (with the modern API the question variable needs no manual wiring)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an all-purpose AI assistant"),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{question}")
])

# 4. [Key change] Compose the chain with LCEL
# No LLMChain is needed here; use the pipe operator directly
chain = prompt | llm

# 5. Manage memory (modern approach: a dict holds the history of each session)
store = {}


def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


# Wrap the chain so it carries conversation memory
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="question",
    history_messages_key="history",
)

# 6. Invoke the chain
config = {"configurable": {"session_id": "xiaoming_test"}}

res1 = with_message_history.invoke({"question": "I am Xiao Ming"}, config=config)
print(f"Answer 1: {res1.content}")

res2 = with_message_history.invoke({"question": "Who am I?"}, config=config)
print(f"Answer 2: {res2.content}")
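
The idea behind the session-keyed `store` can be shown without LangChain: keep one message list per session ID, replay it ahead of each new question, and append the new turn afterwards. This is a conceptual sketch only, not how `RunnableWithMessageHistory` is implemented; `SessionChat` and `echo_model` are hypothetical names, and the "model" is a stand-in that merely reports how much history it received.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, text)


class SessionChat:
    """Keeps one message list per session_id and replays it on every call."""

    def __init__(self, model: Callable[[List[Message]], str]) -> None:
        self._model = model
        self._store: Dict[str, List[Message]] = defaultdict(list)

    def invoke(self, session_id: str, question: str) -> str:
        history = self._store[session_id]
        # The model sees the prior turns first, then the new question.
        answer = self._model([*history, ("human", question)])
        # The turn is committed to history only after the call succeeds.
        history.append(("human", question))
        history.append(("ai", answer))
        return answer


def echo_model(messages: List[Message]) -> str:
    """Stand-in for an LLM: reports how many earlier messages it received."""
    return f"seen {len(messages) - 1} earlier messages"
```

Two calls with the same session ID share history, while a different session ID starts from an empty list, which is the behavior the `xiaoming_test` session in the demo relies on.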
|
||||
78
memory/memory_desc.py
Normal file
@ -0,0 +1,78 @@
|
||||
import os

import dotenv
from langchain.memory import ConversationBufferMemory, ConversationTokenBufferMemory, ConversationSummaryMemory
from langchain_openai import ChatOpenAI
from langchain.memory import ConversationBufferWindowMemory
from langchain_core.messages import HumanMessage, AIMessage, trim_messages

memory = ConversationBufferMemory()
memory.chat_memory.add_user_message("My name is Xiao Ming")
memory.chat_memory.add_ai_message("Oh, okay, your name is Xiao Ming")
print(memory)
print(memory.memory_key)
print(memory.load_memory_variables({}))
print('---------')

memory2 = ConversationBufferMemory(memory_key="memory2")
memory2.chat_memory.add_user_message("My name is Xiao Ming")
memory2.chat_memory.add_ai_message("Oh, okay, your name is Xiao Ming")
print(memory2)
print(memory2.memory_key)
print(memory2.load_memory_variables({}))
print('---------')

memory3 = ConversationBufferMemory(return_messages=True)
memory3.chat_memory.add_user_message("My name is Xiao Ming")
memory3.chat_memory.add_ai_message("Oh, okay, your name is Xiao Ming")
print(memory3)
print(memory3.memory_key)
print(memory3.load_memory_variables({}))
print('---------')
dotenv.load_dotenv()

# Set environment variables (route the OpenAI-compatible client to Ollama)
os.environ['OPENAI_API_KEY'] = os.getenv("OLLAMA_API_KEY")
os.environ['OPENAI_BASE_URL'] = os.getenv("OLLAMA_BASE_URL")

# Default 'model_name': 'deepseek-ai/DeepSeek-V3.1'
llm = ChatOpenAI(model="gemma4:e2b")

memory_test = ConversationSummaryMemory(llm=llm)
memory_test.save_context({"input": "My name is Xiao Ming"}, {"output": "Oh, okay, your name is Xiao Ming"})
memory_test.save_context({"input": "So who are you?"}, {"output": "I am an all-purpose AI chat assistant and can help answer any question."})
print(memory_test.load_memory_variables({}))
# {'history': '\n\nThe human introduces themselves as xiaoming. The AI confirms the name and responds. The human then asks who the AI is. The AI introduces itself as an all-powerful AI chat assistant who can answer any questions.'}
print('---------')

# This one depends on OpenAI-specific tooling, so it is left disabled
# memory_token = ConversationTokenBufferMemory(llm=llm,max_token_limit=7)
# memory_token.save_context({"input": "My name is Xiao Ming"}, {"output": "Oh, okay, your name is Xiao Ming"})
# memory_token.save_context({"input": "So who are you?"}, {"output": "I am an all-purpose AI chat assistant and can help answer any question."})
# print(memory_token.load_memory_variables({}))

# k=1 keeps only the most recent question-and-answer pair (one round)
memory_window = ConversationBufferWindowMemory(k=1)
memory_window.save_context({"input": "My name is Xiao Ming"}, {"output": "Oh, okay, your name is Xiao Ming"})
memory_window.save_context({"input": "So who are you?"}, {"output": "I am an all-purpose AI chat assistant."})
print(memory_window.load_memory_variables({}))
# Only the last round is kept in the output because k=1
print('---------')

messages = [
    HumanMessage("My name is Xiao Ming"),
    AIMessage("Oh, okay, your name is Xiao Ming"),
    HumanMessage("So who are you?"),
    AIMessage("I am an all-purpose AI chat assistant.")
]

# Modern approach: trim the message list directly
# Here we use the simplest strategy: keep only the last 2 messages
trimmed_messages = trim_messages(
    messages,
    max_tokens=2,
    token_counter=len,  # Key point: count messages instead of tokens, which avoids the underlying tiktoken crash
    strategy="last",
)

print(trimmed_messages)
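
The two truncation strategies used above (a window of the last k rounds, and trimming by message count) can be illustrated in plain Python. This sketch assumes that one round is exactly one human message followed by one AI message; the function names are hypothetical, and it is not how `ConversationBufferWindowMemory` or `trim_messages` are implemented internally.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def keep_last_messages(messages: Sequence[T], max_messages: int) -> List[T]:
    """Keep at most the final max_messages items, preserving their order."""
    if max_messages <= 0:
        # Guard: messages[-0:] would return the whole list, not an empty one.
        return []
    return list(messages[-max_messages:])


def keep_last_turns(messages: Sequence[T], k: int) -> List[T]:
    """Keep the last k rounds, assuming each round is two messages."""
    return keep_last_messages(messages, 2 * k)
```

Counting messages rather than tokens is a coarse budget: it is simple and model-independent, but a single very long message still counts as one.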
|
||||
43
memory/with_memory_demo.py
Normal file
@ -0,0 +1,43 @@
|
||||
import logging

from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate, ChatPromptTemplate
from langchain_openai import ChatOpenAI
import os
import dotenv

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
dotenv.load_dotenv()

# Set environment variables
os.environ['OPENAI_API_KEY'] = os.getenv("SILICONFLOW_API_KEY")
os.environ['OPENAI_BASE_URL'] = os.getenv("SILICONFLOW_BASE_URL")

# Default 'model_name': 'deepseek-ai/DeepSeek-V3.1'
llm = ChatOpenAI(model="Qwen/Qwen3-8B")


def chat_with_llm():
    prompt_template = ChatPromptTemplate.from_messages([
        ("system", "You are an all-purpose AI assistant"),
        ("human", "{question}")
    ])
    while True:
        chain = prompt_template | llm
        user_input = input("Ask your next question, or type [quit] to end the session: ")
        if user_input == "quit":
            break
        response = chain.invoke({"question": user_input})
        print(f"AI: {response.content}")

        # Save this turn (the user's question, then the AI's answer) into the prompt
        # so the next turn sees it. Inserting before the trailing {question} template
        # keeps the history in chronological order ahead of the new question.
        # This is a non-standard approach.
        prompt_template.messages.insert(-1, HumanMessage(content=user_input))
        prompt_template.messages.insert(-1, AIMessage(content=response.content))


chat_with_llm()
|
||||
56
memory/without_memory_demo.py
Normal file
@ -0,0 +1,56 @@
|
||||
import logging

from rich.console import Console
from rich.panel import Panel
from rich.markdown import Markdown
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate, ChatPromptTemplate
from langchain_openai import ChatOpenAI
import os
import dotenv

logging.basicConfig(
    level=logging.WARNING,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
dotenv.load_dotenv()

# Set environment variables
os.environ['OPENAI_API_KEY'] = os.getenv("SILICONFLOW_API_KEY")
os.environ['OPENAI_BASE_URL'] = os.getenv("SILICONFLOW_BASE_URL")

# Default 'model_name': 'deepseek-ai/DeepSeek-V3.1'
llm = ChatOpenAI(model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B")

system_prompt = """
You are 木鸡鸡, an advanced AI assistant developed by 方仔仔.
The model you use is Pro 100.0 ProMaxUltra
You have a 128G context
[Core principles]
- Prioritize truthful, accurate, and reliable information.
- If something is uncertain, say so explicitly instead of guessing or making it up.
- Break complex problems down in a structured way and explain them step by step.

[Interaction style]
- Use a natural, professional, and friendly tone.
- Prefer bullet points and short sections to improve readability.

[Safety and limits]
- Do not provide guidance on illegal, dangerous, or harmful activities.
- Do not disclose or speculate about personal privacy or sensitive information.
"""


def chat_with_llm():
    prompt_template = ChatPromptTemplate.from_messages([
        ("system", system_prompt),
        ("human", "{question}")
    ])
    while True:
        chain = prompt_template | llm
        user_input = input()
        if user_input == "quit":
            break
        response = chain.invoke({"question": user_input})
        # Print the answer followed by a separator line
        print(f"AI: {response.content}\n-----------------")


chat_with_llm()
|
||||
82
memory/without_memory_demo_rich.py
Normal file
@ -0,0 +1,82 @@
|
||||
import logging
import os
import dotenv

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Rich imports
from rich.console import Console
from rich.panel import Panel
from rich.markdown import Markdown

# 1. Silence httpx's noisy request logs
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
logging.getLogger("httpx").setLevel(logging.WARNING)

dotenv.load_dotenv()
os.environ['OPENAI_API_KEY'] = os.getenv("SILICONFLOW_API_KEY")
os.environ['OPENAI_BASE_URL'] = os.getenv("SILICONFLOW_BASE_URL")

llm = ChatOpenAI(model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B")

system_prompt = """
You are 木鸡鸡, an advanced AI assistant developed by 方仔仔.
The model you use is Pro 100.0 ProMaxUltra
You have a 128G context
[Core principles]
- Prioritize truthful, accurate, and reliable information.
- If something is uncertain, say so explicitly instead of guessing or making it up.
- Break complex problems down in a structured way and explain them step by step.

[Interaction style]
- Use a natural, professional, and friendly tone.
- Prefer bullet points and short sections to improve readability.

[Safety and limits]
- Do not provide guidance on illegal, dangerous, or harmful activities.
- Do not disclose or speculate about personal privacy or sensitive information.
"""

# Initialize the Rich console
console = Console()


def chat_with_llm():
    prompt_template = ChatPromptTemplate.from_messages([
        ("system", system_prompt),
        ("human", "{question}")
    ])

    # [Engineering improvement] Build the chain once, outside the loop;
    # it does not change between questions.
    chain = prompt_template | llm

    # Print a welcome panel
    console.print(Panel("✨ Welcome to the 木鸡鸡 AI assistant! Type 'quit' to exit.", border_style="green"))

    while True:
        # Show a prompt marker so the input line stands out
        user_input = input("\n👤 You: ")

        if user_input.strip().lower() == "quit":
            console.print("[dim]👋 Goodbye![/dim]")
            break

        # Call the model
        response = chain.invoke({"question": user_input})

        # [Core step] Render the model output with Rich's Panel and Markdown.
        # border_style accepts other colors, such as "cyan", "magenta", or "green".
        console.print(
            Panel(
                Markdown(response.content),
                title="🤖",
                border_style="cyan"
            )
        )


if __name__ == "__main__":
    chat_with_llm()
|
||||
178
ollama/llm_benchmark_dashboard_v2.html
Normal file
@ -0,0 +1,178 @@
|
||||
|
||||
<!DOCTYPE html>
|
||||
<html lang="zh-TW">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<title>LLM Benchmark Dashboard V2</title>
|
||||
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/css/bootstrap.min.css">
|
||||
<style>
|
||||
body { background-color: #f1f4f9; font-family: 'Segoe UI', system-ui, -apple-system, sans-serif; padding: 40px; }
|
||||
.card { border: none; border-radius: 20px; box-shadow: 0 15px 35px rgba(0,0,0,0.1); overflow: hidden; }
|
||||
.header-box { background: linear-gradient(135deg, #00d2ff 0%, #3a7bd5 100%); color: white; padding: 40px; text-align: center; }
|
||||
.table thead th { background: #ffffff; color: #495057; border-bottom: 2px solid #dee2e6; text-transform: uppercase; font-size: 0.85rem; }
|
||||
.progress { height: 10px; border-radius: 5px; background-color: #e9ecef; }
|
||||
.progress-bar { background: linear-gradient(90deg, #3a7bd5, #00d2ff); }
|
||||
.text-success { color: #28a745 !important; }
|
||||
.text-danger { color: #dc3545 !important; }
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="container-fluid">
|
||||
<div class="card">
|
||||
<div class="header-box">
<h1 class="display-4 font-weight-bold">LLM Inference Performance Dashboard V2</h1>
<p class="lead">Multi-dimensional comparison: local deployment vs. cloud API | Data-driven decisions</p>
<span class="badge badge-light">Last updated: 2026-04-14</span>
|
||||
</div>
|
||||
<div class="card-body p-0">
|
||||
<div class="table-responsive">
|
||||
<table class="table table-hover mb-0">
|
||||
<thead class="text-center">
|
||||
<tr>
<th class="text-left">Model</th>
<th>Time to first token (TTFT) ↓</th>
<th>Generation speed (TPS)</th>
<th>Total time</th>
<th>Total characters</th>
<th width="20%">Speed visualization</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody class="text-center">
|
||||
|
||||
<tr>
|
||||
<td><strong>local-gemma4:26b (128K)</strong><br><small class="text-muted">Local</small></td>
|
||||
<td class="text-danger font-weight-bold">87.99s</td>
|
||||
<td class="text-primary">10.79</td>
|
||||
<td>131.7s</td>
|
||||
<td>717</td>
|
||||
<td><div class="progress"><div class="progress-bar" style="width: 10.79%"></div></div></td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td><strong>local-gemma4:26b (32K)</strong><br><small class="text-muted">Local</small></td>
|
||||
<td class="text-danger font-weight-bold">78.67s</td>
|
||||
<td class="text-primary">10.16</td>
|
||||
<td>127.7s</td>
|
||||
<td>722</td>
|
                <td><div class="progress"><div class="progress-bar" style="width: 10.16%"></div></div></td>
              </tr>

              <tr>
                <td><strong>local-gemma4:e4b</strong><br><small class="text-muted">Local</small></td>
                <td class="text-danger font-weight-bold">39.93s</td>
                <td class="text-primary">12.34</td>
                <td>110.4s</td>
                <td>1338</td>
                <td><div class="progress"><div class="progress-bar" style="width: 12.34%"></div></div></td>
              </tr>

              <tr>
                <td><strong>ollama-deepseek-v3.1:671b-cloud</strong><br><small class="text-muted">Cloud (Ollama)</small></td>
                <td class="text-success font-weight-bold">1.04s</td>
                <td class="text-success">51.74</td>
                <td>7.3s</td>
                <td>479</td>
                <td><div class="progress"><div class="progress-bar" style="width: 51.74%"></div></div></td>
              </tr>

              <tr>
                <td><strong>ollama-gemma4:31b-cloud</strong><br><small class="text-muted">Cloud</small></td>
                <td class="text-success font-weight-bold">0.85s</td>
                <td class="text-primary">31.79</td>
                <td>14.2s</td>
                <td>613</td>
                <td><div class="progress"><div class="progress-bar" style="width: 31.79%"></div></div></td>
              </tr>

              <tr>
                <td><strong>ollama-glm-5:cloud</strong><br><small class="text-muted">Cloud (Ollama)</small></td>
                <td class="text-warning font-weight-bold">13.58s</td>
                <td class="text-success">102.25</td>
                <td>19.5s</td>
                <td>779</td>
                <td><div class="progress"><div class="progress-bar" style="width: 100%"></div></div></td>
              </tr>

              <tr>
                <td><strong>ollama-kimi-k2.5:cloud</strong><br><small class="text-muted">Cloud (Ollama)</small></td>
                <td class="text-warning font-weight-bold">15.91s</td>
                <td class="text-primary">29.67</td>
                <td>23.3s</td>
                <td>505</td>
                <td><div class="progress"><div class="progress-bar" style="width: 29.67%"></div></div></td>
              </tr>

              <tr>
                <td><strong>ollama-minimax-m2.7:cloud</strong><br><small class="text-muted">Cloud (Ollama)</small></td>
                <td class="text-danger font-weight-bold">40.40s</td>
                <td class="text-primary">3.75</td>
                <td>40.9s</td>
                <td>508</td>
                <td><div class="progress"><div class="progress-bar" style="width: 3.75%"></div></div></td>
              </tr>

              <tr>
                <td><strong>Bailian-qwen3-max</strong><br><small class="text-muted">Cloud</small></td>
                <td class="text-success font-weight-bold">0.86s</td>
                <td class="text-primary">6.18</td>
                <td>14.1s</td>
                <td>595</td>
                <td><div class="progress"><div class="progress-bar" style="width: 6.18%"></div></div></td>
              </tr>

              <tr>
                <td><strong>Bailian-qwen3.5-35b-a3b</strong><br><small class="text-muted">Cloud</small></td>
                <td class="text-danger font-weight-bold">37.64s</td>
                <td class="text-success">69.15</td>
                <td>39.2s</td>
                <td>543</td>
                <td><div class="progress"><div class="progress-bar" style="width: 69.15%"></div></div></td>
              </tr>

              <tr>
                <td><strong>Bailian-qwen3.6-plus</strong><br><small class="text-muted">Cloud</small></td>
                <td class="text-danger font-weight-bold">77.35s</td>
                <td class="text-primary">15.25</td>
                <td>83.3s</td>
                <td>507</td>
                <td><div class="progress"><div class="progress-bar" style="width: 15.25%"></div></div></td>
              </tr>

              <tr>
                <td><strong>Bailian-qwen3.6-plus-v2</strong><br><small class="text-muted">Cloud</small></td>
                <td class="text-danger font-weight-bold">47.14s</td>
                <td class="text-primary">15.58</td>
                <td>53.1s</td>
                <td>503</td>
                <td><div class="progress"><div class="progress-bar" style="width: 15.58%"></div></div></td>
              </tr>

              <tr>
                <td><strong>Direct-MiniMax-M2.7</strong><br><small class="text-muted">Cloud (Direct)</small></td>
                <td class="text-success font-weight-bold">1.19s</td>
                <td class="text-primary">1.97</td>
                <td>13.9s</td>
                <td>842</td>
                <td><div class="progress"><div class="progress-bar" style="width: 1.97%"></div></div></td>
              </tr>

              <tr>
                <td><strong>SiliconFlow-DeepSeek-R1-Qwen-8B</strong><br><small class="text-muted">Cloud</small></td>
                <td class="text-warning font-weight-bold">10.15s</td>
                <td class="text-success">75.57</td>
                <td>13.4s</td>
                <td>398</td>
                <td><div class="progress"><div class="progress-bar" style="width: 75.57%"></div></div></td>
              </tr>

            </tbody>
          </table>
        </div>
      </div>
    </div>
    <footer class="mt-4 text-center text-muted">
      <small>Generated by the AI Engineering Director | Data source: manual benchmark runs</small>
    </footer>
  </div>
</body>
</html>
27
ollama/llm_benchmark_report_v2.md
Normal file
@ -0,0 +1,27 @@
# LLM Inference Performance Benchmark Report V2 (2026-04)

## 1. Performance Comparison Across All Models
| Model | Deployment | TTFT (s) | TPS (tokens/s) | Total time (s) | Total characters |
| :--- | :--- | :--- | :--- | :--- | :--- |
| local-gemma4:26b (128K) | Local | 87.99 | 10.79 | 131.74 | 717 |
| local-gemma4:26b (32K) | Local | 78.67 | 10.16 | 127.70 | 722 |
| local-gemma4:e4b | Local | 39.93 | 12.34 | 110.43 | 1338 |
| ollama-deepseek-v3.1:671b-cloud | Cloud (Ollama) | 1.04 | 51.74 | 7.30 | 479 |
| ollama-gemma4:31b-cloud | Cloud | 0.85 | 31.79 | 14.22 | 613 |
| ollama-glm-5:cloud | Cloud (Ollama) | 13.58 | 102.25 | 19.53 | 779 |
| ollama-kimi-k2.5:cloud | Cloud (Ollama) | 15.91 | 29.67 | 23.29 | 505 |
| ollama-minimax-m2.7:cloud | Cloud (Ollama) | 40.40 | 3.75 | 40.94 | 508 |
| Bailian-qwen3-max | Cloud | 0.86 | 6.18 | 14.13 | 595 |
| Bailian-qwen3.5-35b-a3b | Cloud | 37.64 | 69.15 | 39.22 | 543 |
| Bailian-qwen3.6-plus | Cloud | 77.35 | 15.25 | 83.32 | 507 |
| Bailian-qwen3.6-plus-v2 | Cloud | 47.14 | 15.58 | 53.11 | 503 |
| Direct-MiniMax-M2.7 | Cloud (Direct) | 1.19 | 1.97 | 13.90 | 842 |
| SiliconFlow-DeepSeek-R1-Qwen-8B | Cloud | 10.15 | 75.57 | 13.40 | 398 |

## 2. Analysis and Conclusions
- **Cloud throughput**: `GLM-5` (102.25 t/s) and `DeepSeek v3.1` (51.74 t/s) showed very high cloud throughput.
- **Local e4b observation**: even the e4b (roughly 4B-scale) model needs about 40 seconds for a local cold start. This suggests the startup bottleneck (disk and Ollama service initialization) depends little on parameter count and more on low-level system I/O.
- **Stability**: TTFT on direct API connections is generally stable at about 1 second, whereas relay or proxy layers (such as some Bailian endpoints) fluctuate more.

---
*Report generated: 2026-04-14*
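The TTFT and TPS columns in the report can be derived from three timestamps and a count of streamed chunks. The following is a minimal sketch of that arithmetic, not the report's actual tooling: `StreamMetrics`, `compute_stream_metrics`, and the chunk count of 324 are hypothetical names and values chosen for illustration, and TPS here means chunks per second after the first chunk arrives.

```python
from dataclasses import dataclass


@dataclass
class StreamMetrics:
    ttft: float        # seconds from request start to the first content chunk
    tps: float         # chunks per second after the first chunk arrived
    total_time: float  # seconds from request start to the end of the stream


def compute_stream_metrics(start: float, first_chunk_at: float,
                           end: float, chunk_count: int) -> StreamMetrics:
    """Derive TTFT and TPS from three timestamps and a chunk count."""
    ttft = first_chunk_at - start
    total_time = end - start
    generation_time = total_time - ttft
    # Guard against a zero-length generation window (e.g. a single chunk).
    tps = chunk_count / generation_time if generation_time > 0 else 0.0
    return StreamMetrics(ttft=ttft, tps=tps, total_time=total_time)


# Hypothetical run: first chunk after 1.04 s, stream finished at 7.30 s.
m = compute_stream_metrics(start=0.0, first_chunk_at=1.04, end=7.30, chunk_count=324)
print(f"TTFT={m.ttft:.2f}s TPS={m.tps:.2f} total={m.total_time:.2f}s")
```

Because TPS excludes the wait for the first chunk, a model can have a long TTFT and still report a high TPS, which is the pattern the `Bailian-qwen3.5-35b-a3b` row shows.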
76
ollama/ollama_rich_chat.py
Normal file
@ -0,0 +1,76 @@
import os
from dotenv import load_dotenv
from openai import OpenAI
from rich.console import Console
from rich.panel import Panel
from rich.markdown import Markdown
from rich.live import Live

load_dotenv()

# 1. Initialize the Rich console
console = Console()

# 2. Initialize the OpenAI client (pointing at local Ollama or SiliconFlow)
client = OpenAI(
    base_url=os.getenv("OLLAMA_BASE_URL"),
    api_key=os.getenv("OLLAMA_API_KEY")
)

# [Core architecture 1]: maintain a conversation history list (this is the AI's memory)
# Make sure the persona (system prompt) goes in first
chat_history = [
    {"role": "system", "content": "You are a senior engineer who is an expert in Python. Keep a professional and friendly tone."}
]

# Print a welcome banner
console.print(Panel("✨ Welcome to the streaming AI assistant! Type 'quit' to exit.", border_style="green"))

# Enter the question-and-answer loop
while True:
    # Multi-line input replaces the original single-line input()
    console.print("\n👤 [bold green]You (multi-line input supported; type '/send' and press Enter to send, 'quit' to exit):[/bold green]")
    lines = []
    while True:
        line = input()
        if line.strip().lower() == 'quit':
            console.print("[dim]👋 Goodbye![/dim]")
            raise SystemExit  # Exit the program immediately
        if line.strip() == '/send':
            break  # End input and leave the collection loop
        lines.append(line)

    # Join the collected lines into one string with real newline characters
    user_input = "\n".join(lines)
    if not user_input.strip():
        continue  # Nothing was typed, so ask again instead of sending an empty message

    # Append the user's new question to the conversation history
    chat_history.append({"role": "user", "content": user_input})

    # Call the model with streaming output enabled (stream=True)
    # Note that messages receives the full chat_history
    response_stream = client.chat.completions.create(
        model="gemma4:26b",  # Replace with the model you actually run
        messages=chat_history,
        stream=True
    )

    full_response = ""

    # [Core architecture 3]: render the UI in real time with a Live block
    with Live(Panel("Thinking...", title="🤖 AI", border_style="cyan"), refresh_per_second=15) as live:
        for chunk in response_stream:
            if not chunk.choices:
                continue  # Some providers may send chunks that carry no choices
            content = chunk.choices[0].delta.content
            if content is not None:
                full_response += content
                # Update the cyan panel as new text arrives
                live.update(
                    Panel(
                        Markdown(full_response),
                        title="🤖 AI",
                        border_style="cyan"
                    )
                )

    # [Core architecture 4]: store the AI's complete answer back into the conversation history
    # so that in the next turn the AI "remembers" what it just said
    chat_history.append({"role": "assistant", "content": full_response})
58
ollama/tps_monitor.py
Normal file
@ -0,0 +1,58 @@
import os
import time
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

# Configure your environment
client = OpenAI(
    base_url=os.getenv("OLLAMA_BASE_URL", "your-URL"),
    api_key=os.getenv("OLLAMA_API_KEY", "your-API-key")
)

def test_model_speed(model_name, prompt="Please write a 500-word article about the future development of AI."):
    print(f"🚀 Testing model: {model_name} ...")

    start_time = time.perf_counter()
    first_token_time = None
    tokens_count = 0
    full_response = ""

    try:
        stream = client.chat.completions.create(
            model=model_name,
            messages=[{"role": "user", "content": prompt}],
            stream=True
        )

        for chunk in stream:
            # Skip chunks without choices or without text content
            if chunk.choices and chunk.choices[0].delta.content:
                if first_token_time is None:
                    # Record the time to first token (TTFT)
                    first_token_time = time.perf_counter() - start_time

                content = chunk.choices[0].delta.content
                full_response += content
                # Rough rule of thumb: 1 Chinese character is about 0.6 to 1 token, 1 English word is about 1.3 tokens
                # This script counts streamed chunks instead, treating each chunk as roughly one token
                tokens_count += 1

        total_time = time.perf_counter() - start_time
        if first_token_time is None:
            # Without this check, the subtraction below would fail on None
            print("❌ No content was received from the model.")
            return
        generation_time = total_time - first_token_time
        tps = tokens_count / generation_time if generation_time > 0 else 0

        print("-" * 30)
        print("📊 Test results:")
        print(f"⏱️ Time to first token (TTFT): {first_token_time:.2f} s")
        print(f"⚡ Generation speed (TPS): {tps:.2f} tokens/s")
        print(f"🕒 Total time: {total_time:.2f} s")
        print(f"📝 Total characters: {len(full_response)}")
        print("-" * 30)

    except Exception as e:
        print(f"❌ Test failed: {e}")

if __name__ == "__main__":
    # Replace with the model you actually want to test
    test_model_speed("deepseek-v3.1:671b-cloud")
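The comment in `tps_monitor.py` quotes a rule of thumb (one Chinese character is about 0.6 to 1 token, one English word about 1.3 tokens) but the script counts chunks instead. As a rough sketch of how that rule could be applied to the final text, the helper below is hypothetical: `estimate_tokens` is not part of the repository, the default of 0.8 tokens per CJK character is an assumed midpoint of the quoted range, and only the basic CJK Unified Ideographs block is detected. A real tokenizer would give different numbers.

```python
def is_cjk(ch: str) -> bool:
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def estimate_tokens(text: str, cjk_ratio: float = 0.8, word_ratio: float = 1.3) -> float:
    """Heuristic token estimate: CJK characters and whitespace-separated words."""
    cjk_chars = sum(1 for ch in text if is_cjk(ch))
    # Replace CJK characters with spaces so they do not glue neighbouring words together.
    non_cjk = "".join(" " if is_cjk(ch) else ch for ch in text)
    words = len(non_cjk.split())
    return cjk_chars * cjk_ratio + words * word_ratio


print(estimate_tokens("AI 发展 fast"))
```

Dividing such an estimate by the generation time would give a TPS figure that is less sensitive to how a provider batches text into chunks, at the cost of depending on the assumed ratios.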
38
parser/json_parser_demo.py
Normal file
@ -0,0 +1,38 @@
import logging

from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
import os
import dotenv

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
dotenv.load_dotenv()

# Set environment variables
os.environ['OPENAI_API_KEY'] = os.getenv("SILICONFLOW_API_KEY")
os.environ['OPENAI_BASE_URL'] = os.getenv("SILICONFLOW_BASE_URL")

# Default 'model_name': 'deepseek-ai/DeepSeek-V3.1'
llm = ChatOpenAI(model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B")

output_parser = JsonOutputParser()
print(output_parser.get_format_instructions())

message = "Tell a short joke"
prompt = PromptTemplate(
    template="Answer the user's request in Chinese.\n{format_instruction}\n{query}",
    input_variables=["query"],
    partial_variables={"format_instruction": output_parser.get_format_instructions()}
)

# Pipe the prompt into the model, then parse the model output as JSON
chain = prompt | llm | output_parser

response = chain.invoke({"query": message})
print(response)
print(type(response))
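The final stage of the chain turns the model's text reply into a Python object. As a rough illustration of what that step involves, and not a description of how `JsonOutputParser` is implemented, the sketch below uses only the standard library. `parse_json_reply` is a hypothetical helper, and the assumption is that the model either returns bare JSON or wraps it in a Markdown code fence.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence marker, built here to avoid writing it literally
FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def parse_json_reply(text: str):
    """Parse a model reply that should contain exactly one JSON value."""
    # If the reply wraps its JSON in a code fence, keep only the fenced part.
    fenced = FENCED_JSON.search(text)
    candidate = fenced.group(1) if fenced else text.strip()
    return json.loads(candidate)  # raises json.JSONDecodeError on malformed output


reply = "Here you go:\n" + FENCE + 'json\n{"joke": "Why did the chicken cross the road?"}\n' + FENCE
print(parse_json_reply(reply))
```

A reply that is neither bare JSON nor fenced JSON raises `json.JSONDecodeError`, which is the point where a caller would retry or report the failure.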
BIN
prompt/__pycache__/prompt_demo.cpython-313.pyc
Normal file
Binary file not shown.
58
prompt/fewshot_demo.py
Normal file
@ -0,0 +1,58 @@
import logging

from langchain_core.prompts import FewShotChatMessagePromptTemplate, ChatPromptTemplate
from langchain_openai import ChatOpenAI
import os
import dotenv

dotenv.load_dotenv()

# Set environment variables
os.environ['OPENAI_API_KEY'] = os.getenv("SILICONFLOW_API_KEY")
os.environ['OPENAI_BASE_URL'] = os.getenv("SILICONFLOW_BASE_URL")

# Default 'model_name': 'deepseek-ai/DeepSeek-V3.1'
llm = ChatOpenAI(model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B")

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)

# Examples that teach the model a made-up "||" operator (here it behaves like addition)
examples = [
    {"input": "1 || 1", "output": "2"},
    {"input": "1 || 2", "output": "3"},
    {"input": "1 || 3", "output": "4"}
]

example_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{input}"),
        ("ai", "{output}"),
    ]
)

few_shot_prompt = FewShotChatMessagePromptTemplate(
    examples=examples,
    example_prompt=example_prompt
)

final_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a math genius"),
        few_shot_prompt,
        ("human", "{input}")
    ]
)
# question = final_prompt.invoke(input = {"input":"1 || 10"})
# # llm : 1||10 ?
# response = llm.invoke(question)
# print(response)

# Nested call: format the prompt, then pass the result to the model
print(llm.invoke(final_prompt.invoke(input={"input": "1 || 10"})))

# The prompt's invoke output becomes the model's input, exactly like a pipe
chain = final_prompt | llm
print(chain.invoke(input={"input": "1 || 10"}))
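To make the few-shot structure concrete, the sketch below shows the message sequence that the final prompt is meant to produce: the system message, one human/AI pair per example, then the new question. It is plain Python written for illustration. `build_few_shot_messages` is a hypothetical helper and does not reproduce LangChain's message classes.

```python
def build_few_shot_messages(system: str, examples: list[dict], user_input: str) -> list[tuple[str, str]]:
    """Expand few-shot examples into an ordered list of (role, content) pairs."""
    messages = [("system", system)]
    for example in examples:
        # Each example contributes a question and the answer the model should imitate.
        messages.append(("human", example["input"]))
        messages.append(("ai", example["output"]))
    # The real question goes last, so the model continues the demonstrated pattern.
    messages.append(("human", user_input))
    return messages


examples = [
    {"input": "1 || 1", "output": "2"},
    {"input": "1 || 2", "output": "3"},
    {"input": "1 || 3", "output": "4"},
]
for role, content in build_few_shot_messages("You are a math genius", examples, "1 || 10"):
    print(f"{role}: {content}")
```

With three examples the model receives eight messages in total, and the only signal about what `||` means comes from the three demonstrated pairs.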
42
prompt/prompt_demo.py
Normal file
@ -0,0 +1,42 @@
import logging

from langchain_core.messages import SystemMessage, HumanMessage
from langchain_core.prompts import PromptTemplate
import langchain

from langchain_openai import ChatOpenAI
import os
import dotenv

dotenv.load_dotenv()

# Set environment variables
os.environ['OPENAI_API_KEY'] = os.getenv("SILICONFLOW_API_KEY")
os.environ['OPENAI_BASE_URL'] = os.getenv("SILICONFLOW_BASE_URL")

# Default 'model_name': 'deepseek-ai/DeepSeek-V3.1'
llm = ChatOpenAI(model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B")

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
print(langchain.__version__)

# Prompt built from message objects
system_message = SystemMessage(
    content="You are a big data expert. When the user asks a question, answer concisely, in no more than 100 tokens.")
human_message = HumanMessage(content="I want to learn Hive. Please draw up a study plan for me.")
message = [system_message, human_message]

print(human_message)

# 1. Create a PromptTemplate
template = PromptTemplate.from_template(template="I want to learn {topic} and {topic2}. Please draw up a study plan for me.")
# 2. Build the complete prompt
hadoop_prompt = template.format(topic="hadoop", topic2="spark")  # returns a plain string
hadoop_prompt2 = template.invoke(input={"topic": "hadoop", "topic2": "spark"})  # returns a prompt value object
print(hadoop_prompt)
print(hadoop_prompt2)
# response = llm.invoke(hadoop_prompt)
# print(response)
5
prompt/prompt_from_file.json
Normal file
@ -0,0 +1,5 @@
{
    "_type": "prompt",
    "input_variables": ["topic","role"],
    "template": "As a {role}, please introduce {topic}"
}
5
prompt/prompt_from_file.yaml
Normal file
@ -0,0 +1,5 @@
_type: "prompt"
input_variables:
  - topic
  - role
template: "As a {role}, please introduce {topic}"
5
prompt/promt_from_file.py
Normal file
@ -0,0 +1,5 @@
from langchain_core.prompts import load_prompt

prompt = load_prompt("prompt_from_file.yaml", encoding="utf-8")
print(prompt.format(topic="spark", role="teacher"))
print(prompt.invoke(input={"topic": "hive", "role": "big data expert"}))
@ -5,8 +5,15 @@ description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
 dependencies = [
-    "langchain>=1.2.15",
-    "langchain-community>=0.4.1",
-    "langchain-siliconflow>=1.0.0",
+    "faiss-cpu>=1.13.2",
+    "fastmcp>=3.2.4",
+    "langchain>=0.3.27",
+    "langchain-community>=0.3.31",
+    "langchain-core>=0.3.40",
+    "langchain-mcp-adapters>=0.0.1",
+    "langchain-openai>=0.3.35",
+    "langchain-siliconflow>=0.1.3",
+    "langgraph>=1.0.1",
+    "requests>=2.33.1",
+    "rich>=15.0.0",
 ]
68
rag/rag_demo.py
Normal file
@ -0,0 +1,68 @@
import logging

from langchain.chains.retrieval_qa.base import RetrievalQA
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
import os
import dotenv
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.vectorstores import FAISS

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
dotenv.load_dotenv()

# Set environment variables
os.environ['OPENAI_API_KEY'] = os.getenv("SILICONFLOW_API_KEY")
os.environ['OPENAI_BASE_URL'] = os.getenv("SILICONFLOW_BASE_URL")

# Default 'model_name': 'deepseek-ai/DeepSeek-V3.1'
llm = ChatOpenAI(model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B")

# 1. Prepare documents for one domain: software testing knowledge
docs = [
    "Equivalence partitioning is a black-box testing method that divides input data into valid and invalid equivalence classes.",
    "Boundary value analysis usually complements equivalence partitioning and focuses on testing the boundary conditions of inputs and outputs.",
    "Integration testing verifies the correctness of interfaces between modules; common strategies include top-down and bottom-up.",
    "Regression testing is run after a software change to ensure existing functionality is not affected by the new modification.",
    "Performance testing includes load testing, stress testing, and endurance testing, and is used to evaluate system responsiveness.",
    "A test case should include the test ID, module, preconditions, steps, expected results, and priority."
]

splitter = CharacterTextSplitter()

# 2. Split the documents (optional)
texts = []
for doc in docs:
    chunks = splitter.split_text(doc)
    texts.extend(chunks)
    logging.info("Document split: original=%s -> %d chunks", doc, len(chunks))
logging.info(texts)

# 3. Embed (vectorize) the texts and build the vector store
embeddings = OpenAIEmbeddings(model="netease-youdao/bce-embedding-base_v1")
# First call to the embedding model: HTTP Request: POST https://api.siliconflow.cn/v1/embeddings "HTTP/1.1 200 OK"
vectorstore = FAISS.from_texts(texts, embeddings)
logging.info("Vector store built")
logging.info(vectorstore)

# 4. Build the RAG chain; the k parameter is the top-K number of documents to retrieve
retriever = vectorstore.as_retriever(search_type='similarity', search_kwargs={"k": 2})
# HTTP Request: POST https://api.siliconflow.cn/v1/embeddings "HTTP/1.1 200 OK"
chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

query = "What is equivalence partitioning?"

# Inspect the retrieval step on its own
# (invoke replaces the deprecated get_relevant_documents)
retrieved_docs = retriever.invoke(query)
logging.info("---------")
for retrieved_doc in retrieved_docs:
    logging.info(retrieved_doc)
logging.info("---------")

# 5. Query: the chain retrieves matching documents and passes them to the model
# HTTP Request: POST https://api.siliconflow.cn/v1/chat/completions "HTTP/1.1 200 OK"
response = chain.invoke(query)
logging.info(response)
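The retrieval step in `rag_demo.py` comes down to ranking stored texts by vector similarity to the query and keeping the top `k`. The sketch below shows that idea with the standard library only. It is a toy under stated assumptions: `embed` is a hypothetical bag-of-words stand-in for the real embedding model, and a linear scan replaces the FAISS index, so scores will not match what the demo produces.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9-]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query, best first."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


docs = [
    "Equivalence partitioning is a black-box testing method.",
    "Regression testing is run after a software change.",
    "Performance testing includes load testing and stress testing.",
]
for doc in top_k("What is equivalence partitioning?", docs, k=2):
    print(doc)
```

The retrieved texts are what a question-answering chain would then place into the model's prompt as context, so the value of `k` trades context size against the chance of missing the relevant passage.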
27
token/token_demo.py
Normal file
@ -0,0 +1,27 @@
from langchain_community.callbacks import get_openai_callback
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
import os
import dotenv

dotenv.load_dotenv()

# Set environment variables
os.environ['OPENAI_API_KEY'] = os.getenv("SILICONFLOW_API_KEY")
os.environ['OPENAI_BASE_URL'] = os.getenv("SILICONFLOW_BASE_URL")

# Default 'model_name': 'deepseek-ai/DeepSeek-V3.1'
llm = ChatOpenAI(model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B")
template = PromptTemplate.from_template("Tell me a dad joke about {topic}")
chain = template | llm

# The callback collects token usage for every call made inside the with block
with get_openai_callback() as cb:
    response = chain.invoke({"topic": "Trump"})
    print(response)

    print("------")
    print(f"total_tokens:{cb.total_tokens}")
    print(f"prompt_tokens :{cb.prompt_tokens}")
    print(f"completion_tokens:{cb.completion_tokens}")
    print(f"total_cost:{cb.total_cost}")
56
tools/tool_definition.py
Normal file
@ -0,0 +1,56 @@
from langchain_core.tools import tool, StructuredTool
from pydantic import BaseModel, Field


# Define functions/tools

# Approach 1: use the decorator
@tool
def get_weather1(city: str) -> str:
    """Look up the latest weather for the given city."""
    # A real tool would call a weather service API here; this demo returns a fixed value
    return f"The current temperature in {city} is 18°C"


class FieldInfo(BaseModel):
    city: str = Field(description="Name of the city to look up")

# Arguments to the decorator can override the function's name, description, and so on
@tool(
    name_or_callable="get_weather2",
    args_schema=FieldInfo,
    description="Look up the weather for a city and return the temperature",
    return_direct=True
)
def get_weather2(city: str) -> str:
    """Look up the latest weather for the given city."""
    # A real tool would call a weather service API here; this demo returns a fixed value
    return f"The current temperature in {city} is 18°C"


# Approach 2: wrap a plain function with StructuredTool
def get_weather3(city: str) -> str:
    """Look up the latest weather for the given city."""
    # A real tool would call a weather service API here; this demo returns a fixed value
    return f"The current temperature in {city} is 18°C"

get_weather3_tool = StructuredTool.from_function(
    func=get_weather3,
    name="get_weather3",
    args_schema=FieldInfo,
    description="A third function that returns the weather"
)


if __name__ == '__main__':
    # Invoke the tool and print the result
    print(get_weather1.invoke({"city": "Beijing"}))
    print(get_weather1.invoke({"city": "Paris"}))

    print(f"name={get_weather3_tool.name}")
    print(f"args={get_weather3_tool.args}")
    print(f"description={get_weather3_tool.description}")
    # return_direct: if False, the return value goes back to the model for further processing
    # before the final answer; if True, it is returned to the user directly.
    print(f"return_direct={get_weather3_tool.return_direct}")
98
tools/tool_demo.py
Normal file
@ -0,0 +1,98 @@
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.tools import tool
from pydantic import BaseModel, Field

import logging

from langchain_openai import ChatOpenAI
import os
import dotenv


# Define functions/tools

# Approach 1: use the decorator
class FieldInfo(BaseModel):
    city: str = Field(description="Name of the city to look up")


# Arguments to the decorator can override the function's name, description, and so on
@tool(
    name_or_callable="get_weather",
    args_schema=FieldInfo,
    description="Look up the weather for a city and return the temperature",
    return_direct=True
)
def get_weather(city: str) -> str:
    """Look up the latest weather for the given city."""
    # A real tool would call a weather service API here; this demo returns a fixed value
    return f"The current temperature in {city} is 18°C"

@tool
def test(city: str) -> str:
    """This is a test function."""
    # Dummy tool used to check that the model picks the right one
    return f"The current temperature in {city} is 18°C"
######################################################################

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)

dotenv.load_dotenv()

# Set environment variables
# os.environ['OPENAI_API_KEY'] = os.getenv("SILICONFLOW_API_KEY")
# os.environ['OPENAI_BASE_URL'] = os.getenv("SILICONFLOW_BASE_URL")
os.environ['OPENAI_API_KEY'] = os.getenv("OLLAMA_API_KEY")
os.environ['OPENAI_BASE_URL'] = os.getenv("OLLAMA_BASE_URL")
# os.environ['OPENAI_API_KEY'] = os.getenv("MINIMAX_API_KEY")
# os.environ['OPENAI_BASE_URL'] = os.getenv("MINIMAX_BASE_URL")

# Function calling requires a model that supports it
# Otherwise the API reports: Error code: 400 - {'code': 20037, 'message': 'Function call is not supported for this model.', 'data': None}
# llm = ChatOpenAI(model="Qwen/Qwen2.5-7B-Instruct")
llm = ChatOpenAI(model="gemma4:e2b")
# llm = ChatOpenAI(model="MiniMax-M2.7")


# Bind the tools to the model
llm_with_tools = llm.bind_tools([get_weather, test], tool_choice="auto")

messages = [HumanMessage(content="What is the temperature in Paris right now?")]

# The model decides which tool to use
response = llm_with_tools.invoke(messages)
print(response)

# The model replies that get_weather should be called with the argument Paris, but it does not execute the call itself

# If a tool call was requested, run it and send the result back
for call in getattr(response, "tool_calls", []) or []:
    # print(call)
    # {'name': 'get_weather', 'args': {'city': 'Paris'}, 'id': '019a305388b40ad0d961da5696e9fd2f', 'type': 'tool_call'}
    if call["name"] == "get_weather":
        args = call["args"]  # for example {"city": "Paris"}
        result = get_weather.invoke(args)
        messages.append(response)  # add the model's message to the conversation
        messages.append(
            ToolMessage(
                content=result,
                tool_call_id=call["id"],  # must match the id of this tool call
            )
        )
        # 5) Second round: hand the tool result back to the model so it can produce the final answer
        final_msg = llm.invoke(messages)
        print("FINAL:", final_msg.content)
        break
else:
    # No get_weather call was handled, so print the model's direct answer
    print("FINAL(no-tool):", response.content)
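The loop in `tool_demo.py` handles only `get_weather` by name. A more general pattern is to look tools up in a registry keyed by name, as in the sketch below. It is framework-free and only illustrative: `TOOLS` and `run_tool_calls` are hypothetical names, and the input dicts follow the `{'name', 'args', 'id', 'type'}` shape shown in the commented example above rather than any guaranteed schema.

```python
def get_weather(city: str) -> str:
    """Stand-in tool: returns a fixed temperature instead of calling a real API."""
    return f"The current temperature in {city} is 18°C"


# Registry mapping tool names to plain Python callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Execute each requested tool and return one tool-result message per call."""
    results = []
    for call in tool_calls:
        func = TOOLS.get(call["name"])
        if func is None:
            # Report unknown tools back to the model instead of raising.
            content = f"Unknown tool: {call['name']}"
        else:
            content = func(**call["args"])
        # The id ties each result to the call that requested it.
        results.append({"role": "tool", "tool_call_id": call["id"], "content": content})
    return results


calls = [{"name": "get_weather", "args": {"city": "Paris"}, "id": "call-1", "type": "tool_call"}]
for message in run_tool_calls(calls):
    print(message)
```

Returning an error message for unknown tools, rather than raising, lets the second model round see what went wrong and recover.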