After parsing a PDF, its images are uploaded in base64 format, consuming a large number of tokens #320

whitewatercn · 2025-01-12T14:14:00Z

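I have confirmed that there is no similar existing issue
I have confirmed that I have upgraded to the latest version
I have read the project README and documentation in full and did not find a solution
I understand and am willing to follow up on this issue, help with testing, and provide feedback
I will ask questions politely and respectfully and will not use uncivil language (this applies equally to everyone commenting here; those who do not comply will be blocked)
I understand and accept the above, and understand that the maintainers' time is limited; issues that do not follow the rules may be ignored or closed directly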

Problem description
I tried to use the gpt-4o model to summarize the contents of this PDF: fpubh-12-1368933.pdf

The request failed with this error:
This model's maximum context length is 128000 tokens. However, your messages resulted in 180050 tokens. Please reduce the length of the messages. (type: invalid_request_error)

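For how multimodal models count image tokens, see OpenAI's official pricing page: https://openai.com/api/pricing . This is a small PDF with only a few images, so it should not come anywhere near 180050 tokens.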
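As a rough sanity check, the image token cost can be estimated with the tile-based accounting OpenAI has documented for high-detail images on gpt-4o. The constants below (85 base tokens, 170 tokens per 512 px tile, the 2048 px and 768 px resize limits) are assumptions taken from that documentation and may change; the function name is hypothetical, and the dimensions are those of the image extracted from this PDF.

```python
import math

def estimate_image_tokens(width: int, height: int) -> int:
    """Rough high-detail image token estimate.

    Assumed constants: 85 base tokens plus 170 tokens per 512x512 tile.
    """
    # Step 1: shrink the image so it fits inside a 2048x2048 square.
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale
    # Step 2: shrink so the shortest side is at most 768 px
    # (assumption: smaller images are not upscaled).
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale
    # Step 3: count the 512 px tiles needed to cover the image.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles

print(estimate_image_tokens(1553, 704))  # 1445
```

Under these assumptions a 1553x704 image costs on the order of 1,500 tokens when sent as an image, which is far below the 180050 tokens reported in the error.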

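Looking at the content parsed by https://blob.chatnio.net/ (the parsed output is at https://gist.github.com/whitewatercn/d1dd7488a158e0e0b0a29d9008ff1a77 ), there is one extremely long base64-encoded image. A decoding tool confirms that it is an image from the PDF that was converted to base64. Its properties are: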

Width: 1553
Height: 704
File size: 84064 bytes
The length of the Base64 encoded string is: 333728

I suspect this very long encoded string is being treated as text and is consuming a large number of tokens.

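An earlier issue ( #215 ) mentions:

When a model does not support image recognition, a base64-encoded image is treated as text, and overly long text (tokens) leads to excessive cost, so a check is applied that omits the image to prevent this.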
#215 (comment)

However, I am using gpt-4o.
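To illustrate the behavior I would expect, here is a minimal sketch that separates inline base64 images from the surrounding text before a request is built. It assumes the parser embeds images as data URIs such as `data:image/png;base64,...` (the real output format may differ), and `split_content` is a hypothetical helper, not part of this project. The `image_url` part follows the message shape the OpenAI Chat Completions API accepts for vision models.

```python
import re

# Assumption: images appear in the parsed text as inline data URIs.
DATA_URI = re.compile(r"data:image/[a-zA-Z0-9.+-]+;base64,[A-Za-z0-9+/=]+")

def split_content(parsed: str, vision: bool) -> list[dict]:
    """Split parsed text into chat content parts.

    Vision models get each image as an image_url part; for other
    models the base64 payload is dropped instead of sent as text.
    """
    parts = []
    pos = 0
    for match in DATA_URI.finditer(parsed):
        text = parsed[pos:match.start()]
        if text.strip():
            parts.append({"type": "text", "text": text})
        if vision:
            parts.append(
                {"type": "image_url", "image_url": {"url": match.group(0)}}
            )
        pos = match.end()
    tail = parsed[pos:]
    if tail.strip():
        parts.append({"type": "text", "text": tail})
    return parts

parts = split_content("Intro data:image/png;base64,QUJD end", vision=True)
print([p["type"] for p in parts])  # ['text', 'image_url', 'text']
```

With `vision=False` the same input yields only the two text parts, so the base64 payload never reaches the tokenizer as plain text. Any wrapper around the data URI (for example Markdown image syntax) would remain in the text parts in this sketch.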

Issues-translate-bot · 2025-01-12T14:14:12Z

Bot detected the issue body's language is not English, translate it automatically.

Title: After parsing the pdf, the images will be uploaded in base64 format, occupying a large amount of tokens.

whitewatercn added the bug Something isn't working label Jan 12, 2025
