[RFC] 018 - Multi-Model Provider Support, Phase 1: Architecture Design & AWS Bedrock / Zhipu / Gemini / Moonshot Support #737
-
Tech-selection discussion: AI SDK vs LangChain. Direction-wise, I think the chatCompletion module should stay on OpenAI plus the Vercel AI SDK rather than LangChain: the streaming functionCall bug has been open for almost a week now and the LangChain team still hasn't fixed it. Besides, the community-wide trend is for API calling styles and conventions to converge on OpenAI's chat completion, so openai + Vercel AI SDK covers roughly 80% of mainstream models' needs.
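Because so many providers expose OpenAI-compatible endpoints, one OpenAI-SDK client implementation can often serve several of them just by swapping the `baseURL`. A minimal sketch of that idea; the URL table and the helper name are illustrative assumptions, and each endpoint should be verified against the provider's own docs:

```typescript
// Hypothetical baseURL table: providers with OpenAI-compatible APIs can
// share one client implementation. URLs here are assumptions to verify.
const providerBaseURLs: Record<string, string> = {
  moonshot: 'https://api.moonshot.cn/v1',
  openai: 'https://api.openai.com/v1',
  zhipu: 'https://open.bigmodel.cn/api/paas/v4',
};

// Resolve the endpoint for a provider, falling back to OpenAI's.
function resolveBaseURL(provider: string): string {
  return providerBaseURLs[provider] ?? providerBaseURLs.openai;
}
```

The resolved URL would then be passed as `baseURL` when constructing the OpenAI SDK client, keeping the chat logic identical across providers.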
-
Error-handling notes: Zhipu

Initialization

Chat runtime:

```json
{
  "message": "test",
  "name": "TypeError",
  "stack": "TypeError: test\n    at LobeZhipuAI.chat (webpack-internal:///(rsc)/./src/libs/agent-runtime/zhipu/index.ts:53:19)\n"
}
```
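To make failures like the TypeError above actionable on the client, each runtime can normalize its initialization and chat errors into one shape. A hedged sketch of such a mapper; the error-type names and the `status === 401` heuristic are assumptions for illustration, not LobeChat's actual implementation:

```typescript
type AgentErrorType = 'InvalidProviderAPIKey' | 'ProviderBizError';

interface AgentRuntimeError {
  errorType: AgentErrorType;
  message: string;
  provider: string;
}

// Map an unknown thrown value to a typed runtime error: HTTP 401s become
// API-key errors, everything else a generic business-logic error.
function mapProviderError(provider: string, e: unknown): AgentRuntimeError {
  const message = e instanceof Error ? e.message : String(e);
  const status =
    typeof e === 'object' && e !== null ? (e as { status?: number }).status : undefined;
  return {
    errorType: status === 401 ? 'InvalidProviderAPIKey' : 'ProviderBizError',
    message,
    provider,
  };
}
```

With a shape like this, the route layer can return the `errorType` to the client, which decides whether to show an API-key form or a generic error toast.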
-
@arvinxx The error handling with stream should be:

```js
stream
  .on('error', (e) => handleError(e))
  .pipe(b)
  .on('error', (e) => handleError(e))
  .pipe(c)
  .on('error', (e) => handleError(e));

function handleError(e) {
  throw e; // Maybe also some logging or smth.
}
```
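The per-stage `.on('error')` wiring is needed because `.pipe()` does not forward errors on its own. One way to centralize it is to forward each stage's error into the terminal stream and consume that stream with `for await`, so a single try/catch around the consumer sees every failure. A toy sketch of that pattern (the stages here are illustrative, not LobeChat code):

```typescript
import { Readable, Transform } from 'node:stream';

// Toy pipeline: a source piped through an uppercasing transform.
// Source errors are forwarded into the transform via destroy(), so the
// consumer's single try/catch (or rejected promise) sees all failures.
async function runPipeline(): Promise<string> {
  const source = Readable.from(['hello', ' ', 'world']);
  const upper = new Transform({
    transform(chunk, _enc, cb) {
      cb(null, String(chunk).toUpperCase());
    },
  });
  source.on('error', (e) => upper.destroy(e)); // forward source errors
  let out = '';
  for await (const chunk of source.pipe(upper)) out += String(chunk);
  return out;
}
```

On Node 16+, `stream.pipeline` / `stream/promises` does this forwarding and cleanup automatically and is usually the better choice.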
-
Cadence: the work will proceed in three phases:
-
Gemini logo: https://icon-sets.iconify.design/logos/google-gemini/
-
Implementation approach

Overall approach

Interfaces supported by LobeAIProvider

Error types
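A hedged sketch of what a per-provider runtime abstraction like the LobeAIProvider mentioned above might look like; the names and shapes are assumptions for illustration, not the actual LobeChat definitions:

```typescript
// Assumed payload shape for a chat call.
interface ChatStreamPayload {
  messages: { content: string; role: 'assistant' | 'system' | 'user' }[];
  model: string;
  temperature?: number;
}

// Assumed provider contract: every provider runtime exposes chat() and
// returns a streaming-capable Response.
interface LobeRuntimeAI {
  chat(payload: ChatStreamPayload): Promise<Response>;
}

// Placeholder provider that echoes the last message instead of calling a
// real upstream API; a real Zhipu/Bedrock/Gemini runtime would implement
// the same interface.
class EchoProvider implements LobeRuntimeAI {
  async chat(payload: ChatStreamPayload): Promise<Response> {
    const last = payload.messages.at(-1)?.content ?? '';
    return new Response(last);
  }
}
```

The route layer can then hold a `Record<string, LobeRuntimeAI>` and dispatch on the provider id without knowing provider-specific details.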
-
Points to note:
-
For the provider config form UI, I suggest generating it uniformly from a JSON Schema: https://github.com/rjsf-team/react-jsonschema-form/
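For illustration, a schema for one provider's settings that react-jsonschema-form could render directly as a form; the field names here are hypothetical, not LobeChat's actual config keys:

```typescript
// Hypothetical JSON Schema for a provider settings panel; rjsf renders
// `properties` as form fields and enforces `required` on submit.
const providerConfigSchema = {
  properties: {
    apiKey: { title: 'API Key', type: 'string' },
    enabled: { default: false, title: 'Enable provider', type: 'boolean' },
    endpoint: { format: 'uri', title: 'Custom endpoint', type: 'string' },
  },
  required: ['apiKey'],
  title: 'Provider settings',
  type: 'object',
} as const;
```

The appeal of this approach is that adding a provider only means adding a schema object, not writing a new form component.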
-
Zhipu AI integration pitfalls: Since Zhipu's V4 API has been aligned with OpenAI's, the overall integration went smoothly, but a few pitfalls remain: the base64 image needs its prefix removed; and Zhipu does not support …
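Stripping the image's data-URL prefix (the base64 pitfall above) can be sketched as follows; the regex is an assumption covering common image MIME types:

```typescript
// Zhipu's vision endpoint reportedly wants raw base64, not a data URL,
// so drop the `data:image/...;base64,` prefix if present.
function stripDataUrlPrefix(dataUrl: string): string {
  return dataUrl.replace(/^data:image\/[\w.+-]+;base64,/, '');
}
```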
-
AWS Bedrock integration notes: Following https://sdk.vercel.ai/docs/guides/providers/aws-bedrock makes the integration fairly smooth.

Initialize the client:

```ts
import { BedrockRuntimeClient } from '@aws-sdk/client-bedrock-runtime';

const bedrockClient = new BedrockRuntimeClient({
  region: process.env.AWS_REGION ?? 'us-east-1',
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID ?? '',
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY ?? '',
  },
});
```

The chat interface:

```ts
import { InvokeModelWithResponseStreamCommand } from '@aws-sdk/client-bedrock-runtime';
import { AWSBedrockAnthropicStream, StreamingTextResponse } from 'ai';
import { experimental_buildAnthropicPrompt } from 'ai/prompts';

// Ask Claude for a streaming chat completion given the prompt
const bedrockResponse = await bedrockClient.send(
  new InvokeModelWithResponseStreamCommand({
    modelId: 'anthropic.claude-v2',
    contentType: 'application/json',
    accept: 'application/json',
    body: JSON.stringify({
      prompt: experimental_buildAnthropicPrompt(messages),
      max_tokens_to_sample: 300,
    }),
  }),
);

// Convert the response into a friendly text-stream
const stream = AWSBedrockAnthropicStream(bedrockResponse);

// Respond with the stream
return new StreamingTextResponse(stream);
```
However, integrating llama2 ran into problems: llama's prompt and request-body structure differ a lot from the Anthropic-style one used above. For example, reusing the same body yields:

```js
{
  body: {
    httpStatusCode: 400,
    requestId: '*',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  message: 'Malformed input request: #: extraneous key [max_tokens_to_sample] is not permitted, please reformat your input and try again.',
  region: 'us-east-1',
  type: 'ValidationException'
}
```

So the plan is to leave llama2 out of phase one.
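The ValidationException above comes from sending Anthropic-style keys to a llama model, so the request body has to be shaped per model family. A hedged sketch of that dispatch; the key names follow Bedrock's per-family request formats as I understand them and should be verified against the Bedrock documentation:

```typescript
// Build a Bedrock InvokeModel body per model family: Anthropic text models
// take `max_tokens_to_sample`, while Meta llama models take `max_gen_len`.
function buildBedrockBody(modelId: string, prompt: string, maxTokens = 300): string {
  if (modelId.startsWith('anthropic.')) {
    return JSON.stringify({ max_tokens_to_sample: maxTokens, prompt });
  }
  if (modelId.startsWith('meta.')) {
    return JSON.stringify({ max_gen_len: maxTokens, prompt });
  }
  throw new Error(`unsupported Bedrock model family: ${modelId}`);
}
```

Dispatching on the `modelId` prefix like this keeps a single `InvokeModelWithResponseStreamCommand` call site while isolating the family-specific payloads.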
-
Google Gemini integration: The core issue is proxying. Following google-gemini/generative-ai-js#29 (comment), a proxy was added in the provider route: in proxy mode it uses the nodejs runtime, while production still uses the edge runtime.

```ts
// Some regions cannot access Google / OpenAI directly,
// so we need a proxy to reach them.
const proxyUrl = process.env.HTTP_PROXY_URL;
const useProxy = !!proxyUrl;

if (useProxy) {
  const { setGlobalDispatcher, ProxyAgent } = require('undici');
  setGlobalDispatcher(new ProxyAgent({ uri: proxyUrl }));
}

// undici can only be used in Node.js,
// so when using a proxy, switch to the Node.js runtime.
export const runtime = useProxy ? 'nodejs' : 'edge';
```
-
How to integrate a new Provider going forward

modelProviders metadata:
```ts
import { ModelProviderCard } from '@/types/llm';

const ZhiPu: ModelProviderCard = {
  chatModels: [
    {
      description: 'Latest GLM-4; supports up to 128k context, Function Call, and Retrieval',
      displayName: 'GLM-4',
      // functionCall: true,
      id: 'glm-4',
      tokens: 128_000,
    },
    {
      description:
        'Deep fusion of visual and language features; supports multimodal tasks such as visual question answering, image captioning, visual grounding, and complex object detection',
      displayName: 'GLM-4 Vision',
      id: 'glm-4v',
      tokens: 128_000,
      vision: true,
    },
    {
      description: 'Latest glm-3-turbo; supports up to 128k context, Function Call, and Retrieval',
      displayName: 'GLM-3 Turbo',
      // functionCall: true,
      id: 'glm-3-turbo',
      tokens: 128_000,
    },
  ],
  id: 'zhipu',
};

export default ZhiPu;
```

Server-side implementation:
- Add environment variables: in …
- Add a runtime: implement a new LobeRuntimeAI under runtime; the package must include initialization and a chat interface, and implement both kinds of error handling: initialization errors and business-logic errors
- Route integration: wire the implementation into chat/provider/route
- Default config integration: in api/config, set the property sent to the client that marks whether the provider is enabled

Client-side implementation:
- Model / provider icons: add the branching logic in Components/ModelIcon
- selectors: add the enabled/disabled logic
- settings: add a settings-panel rendering component
- API config panel: add a config form under error/apikeyform

Server-side:
Client-side:
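The api/config step above, where the server tells the client whether each provider is enabled, can be sketched as a per-provider flag that a client-side selector reads. The shapes and names below are illustrative assumptions, not LobeChat's actual state types:

```typescript
// Assumed server config shape: one enabled flag per provider id.
interface GlobalServerConfig {
  enabledProviders: Record<string, boolean>;
}

const serverConfig: GlobalServerConfig = {
  enabledProviders: { bedrock: true, google: true, zhipu: true },
};

// Selector-style helper: providers absent from the config default to
// disabled, so new providers are opt-in.
function isProviderEnabled(config: GlobalServerConfig, provider: string): boolean {
  return config.enabledProviders[provider] ?? false;
}
```

Defaulting unknown ids to `false` means a client built against a newer provider list degrades gracefully against an older server.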
-
Background

As LobeChat has grown, the community has developed new demands around model-provider diversity. We should not anchor solely on OpenAI; instead we need to diversify the supported model providers and give users more conversation options.