After a PDF is parsed, its images are uploaded as base64 and consume a large number of tokens #320

Open
6 tasks done
whitewatercn opened this issue Jan 12, 2025 · 1 comment
Labels
bug Something isn't working

Comments

whitewatercn commented Jan 12, 2025

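  • I have confirmed that there is currently no similar issue
  • I have confirmed that I have upgraded to the latest version
  • I have read the project README and documentation in full and found no solution
  • I understand and am willing to follow up on this issue, help with testing, and provide feedback
  • I will ask questions politely and respectfully and will not use uncivil language (this applies equally to everyone who comments here; those who do not comply will be blocked)
  • I understand and accept the above, and I understand that the maintainers have limited time, so issues that do not follow the rules may be ignored or closed directly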

Problem description
I tried to use the gpt-4o model to summarize the contents of this PDF: fpubh-12-1368933.pdf

The request failed with this error:
This model's maximum context length is 128000 tokens. However, your messages resulted in 180050 tokens. Please reduce the length of the messages. (type: invalid_request_error)

[Image]

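For how image tokens are counted for multimodal models, see OpenAI's official pricing page, https://openai.com/api/pricing. For a PDF this small, with only a few images in it, the total should be nowhere near 180050 tokens.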
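
As a rough check on that expectation, the sketch below applies the tile-based rule that OpenAI has documented for high-detail images on gpt-4o: 85 base tokens plus 170 tokens per 512 px tile, after downscaling. The constants and the function name `estimate_image_tokens` are assumptions taken from that documentation, not from this project, and the rule may change. The dimensions passed in are those of the image reported further down in this issue.

```python
import math


def estimate_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Approximate token cost of one image (assumed OpenAI tile rule)."""
    if detail == "low":
        # Low detail is billed as a flat base cost in the assumed rule.
        return 85
    # Step 1: shrink to fit inside a 2048 x 2048 square, keeping aspect ratio.
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale
    # Step 2: shrink so the shortest side is at most 768 px.
    # Assumption: images are never upscaled.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale
    # Step 3: count the 512 px tiles needed to cover the image.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles


print(estimate_image_tokens(1553, 704))  # 1445 under these assumptions
```

If the image were passed to the model as an image, a cost in this range would be expected, which is orders of magnitude below the 180050 tokens in the error message.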

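Looking at the content parsed through https://blob.chatnio.net/ (the parsed output is at https://gist.github.com/whitewatercn/d1dd7488a158e0e0b0a29d9008ff1a77 ), there is one extremely long base64-encoded image. Decoding it with a conversion tool confirms that it is an image from the PDF that was converted to base64. Its properties are: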

Width: 1553
Height: 704
File size: 84064 bytes
The length of the Base64 encoded string is: 333728

I suspect this very long encoded string is being treated as plain text and is consuming a large number of tokens.
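
One way to test this suspicion is to scan the parsed text for embedded base64 images and measure how many characters they contribute. The sketch below is illustrative only: it assumes the images appear as `data:image/...;base64,...` data URIs, which may not match the actual parser output, and `strip_base64_images` is a hypothetical helper, not part of this project.

```python
import re

# Assumption: embedded images appear as data URIs such as
# "data:image/png;base64,iVBORw0...". The real parser output may differ.
DATA_URI_RE = re.compile(r"data:image/[a-zA-Z0-9.+-]+;base64,[A-Za-z0-9+/=]+")


def strip_base64_images(
    text: str, placeholder: str = "[image omitted]"
) -> tuple[str, list[int]]:
    """Replace each base64 data URI with a placeholder.

    Returns the cleaned text and the character length of every URI removed.
    """
    lengths = [len(m.group(0)) for m in DATA_URI_RE.finditer(text)]
    cleaned = DATA_URI_RE.sub(placeholder, text)
    return cleaned, lengths


sample = "Intro ![fig](data:image/png;base64," + "A" * 1000 + ") outro"
cleaned, lengths = strip_base64_images(sample)
print(cleaned)  # the data URI is replaced by the placeholder
print(lengths)  # character count of each removed data URI
```

Comparing the token count of the original text with that of the cleaned text would show how much of the 180050 tokens comes from the encoded image.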

[Image]

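An earlier issue ( #215 ) states:

When the model does not support image recognition, a base64-encoded image is treated as text, and overly long text (tokens) leads to excessive cost, so a check is applied to omit such images and prevent this from happening.
#215 (comment)

However, I am using gpt-4o.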

@whitewatercn whitewatercn added the bug Something isn't working label Jan 12, 2025
@Issues-translate-bot

Bot detected the issue body's language is not English, translate it automatically.


Title: After parsing the pdf, the images will be uploaded in base64 format, occupying a large amount of tokens.
