Remove exif in image file #2168

Merged Jun 21, 2022 (12 commits)

@eziosudo eziosudo commented Jun 19, 2022

What type of PR is this?

/kind feature

Optionally add one or more of the following kinds if applicable:
/kind api-change

What this PR does / why we need it:

Provides an option for users to remove Exif information when uploading an image file.

Which issue(s) this PR fixes:

Fixes #1720

Special notes for your reviewer:

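  1. Apache commons-imaging can only remove Exif information from JPEG/JPG images.
    Reference:
    Apache Commons Imaging
    ExifRewriter
  2. I also tried another open-source library, metadata-extractor. Unfortunately it only supports reading metadata, not modifying or deleting it (issue583, issue503). So far I have not found a Java library that can remove Exif from more image formats. (If you know of a better third-party library, please point it out.)
  3. Because this is byte-stream processing, unit tests are hard to write. The cases I covered locally are:
    a). With the RemoveExif switch on, non-image files (e.g. txt, pdf) do not reach the Exif removal logic.
    b). With the RemoveExif switch off, the preProcess method of ImageFilePreHandler returns immediately without processing.
    c). With the RemoveExif switch on, JPG/JPEG images upload normally on my machine and their Exif information is removed.
    d). With the RemoveExif switch on, PNG/BMP images upload normally on my machine. org.apache.commons.imaging.common.BinaryFunctions#readAndVerifyBytes() throws because the magic-number check on the byte stream fails (the data is not JPG/JPEG), so the Exif information inside is not processed and the original data is returned.
  4. Added FilePreHandler as the pre-processing interface and FilePreHandlers as the factory for pre-processing handlers, into which the FilePreHandler implementations are injected up front.
  5. Added isImageType(@NonNull MultipartFile file) to ImageUtils: if the file is not an image, there is no need to run the pre-processing logic. The method was copied from run.halo.app.handler.file.FileHandler#isImageType, because having FilePreHandler extend FileHandler did not seem appropriate, so I copied it into the utility class instead.
  6. Added ImageMultipartFile, which implements the MultipartFile interface. In the upload endpoint, both the request parameter and the upload method's parameter are MultipartFile, so I chose the interface signature MultipartFile preProcess(MultipartFile file). The byte-stream work is done inside the method, and the new class is needed to convert the resulting bytes back into a MultipartFile. (PS: this class is really just a byte-stream-to-MultipartFile adapter that other features might also use, but I could not find a suitable place for it, so for now it is named and handled as part of the image logic.)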
    @NonNull
    public UploadResult upload(@NonNull MultipartFile file,
        @NonNull AttachmentType attachmentType) {
        // New pre-processing step
        file = filePreHandlers.doPreProcess(file);
        return getSupportedType(attachmentType).upload(file);
    }
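
The pre-processing hook described in the notes above can be pictured roughly as follows. This is a minimal JDK-only sketch, not the code from this PR: it works on byte[] rather than MultipartFile, and the names PreHandlerChainSketch, jpegOnly, and startsWithJpegSoi are hypothetical. The JPEG check only looks for the SOI marker (0xFF 0xD8) at the start of the data, which is enough to illustrate why PNG/BMP uploads pass through unchanged.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical sketch: applies byte-level pre-handlers in order before upload. */
final class PreHandlerChainSketch {
    private final List<UnaryOperator<byte[]>> handlers;

    PreHandlerChainSketch(List<UnaryOperator<byte[]>> handlers) {
        // Defensive copy so later changes to the caller's list do not affect the chain.
        this.handlers = Collections.unmodifiableList(new ArrayList<>(handlers));
    }

    /** Runs every handler in registration order; each one sees the previous result. */
    byte[] doPreProcess(byte[] bytes) {
        byte[] current = bytes;
        for (UnaryOperator<byte[]> handler : handlers) {
            current = handler.apply(current);
        }
        return current;
    }

    /** True when the data begins with the JPEG start-of-image marker FF D8. */
    static boolean startsWithJpegSoi(byte[] data) {
        return data != null
            && data.length >= 2
            && (data[0] & 0xFF) == 0xFF
            && (data[1] & 0xFF) == 0xD8;
    }

    /** Wraps a handler so that non-JPEG input is returned untouched. */
    static UnaryOperator<byte[]> jpegOnly(UnaryOperator<byte[]> delegate) {
        return bytes -> startsWithJpegSoi(bytes) ? delegate.apply(bytes) : bytes;
    }
}
```

In a real handler the delegate would be the Exif-stripping step; here any byte[] to byte[] function can stand in for it.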

Does this PR introduce a user-facing change?

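Whether Exif information is removed from images is decided by the user's attachment settings.
A separate PR adds an Exif removal option to the attachment settings on the admin page.

The attachment settings gain an Exif removal option, which supports removing Exif information from JPG/JPEG images.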

@f2c-ci-robot f2c-ci-robot bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API labels Jun 19, 2022
@eziosudo
Member Author

The issue IDs in the commits are all wrong; please ignore them.

@ruibaby
Member

ruibaby commented Jun 19, 2022

/cc @halo-dev/sig-halo

/milestone 1.6.x

@f2c-ci-robot f2c-ci-robot bot added this to the 1.6.x milestone Jun 19, 2022
1. Modify for code review.
2. Keep 'Orientation' Exif tag for images.
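@eziosudo
Member Author

@guqing New commits:

  1. The preProcess method's parameter and return types are now byte[] and byte[]. The other review comments have been addressed as well.
  2. Testing revealed a new problem: removing all Exif with removeExifMetadata() leaves the image incorrectly oriented. The "Orientation" tag therefore has to be kept, so I switched to updateExifMetadataLossless(). The related cases have been tested locally.
  3. In the pre-processing method of PictureExifRemovalPreHandler, the IllegalArgumentException thrown by Imaging.getMetadata(bytes) is caught so that non-image files are skipped. As a result, every non-image upload logs "Cannot parse to image format." (Is that acceptable?)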
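
To make the Orientation point concrete: the Exif Orientation tag tells a viewer how to rotate the stored pixels for display, so dropping it makes a rotated photo render sideways. The sketch below is illustrative only and is not code from this PR; the class and method names are hypothetical, and it covers only the four non-mirrored orientation values (1, 3, 6, 8) defined by the Exif standard.

```java
import java.util.OptionalInt;

/** Hypothetical helper: maps an Exif Orientation value to a display rotation. */
final class ExifOrientationSketch {
    /**
     * Returns the clockwise rotation, in degrees, that a viewer applies to show
     * the stored pixels upright. Mirrored variants (2, 4, 5, 7) and unknown
     * values are left out of this sketch and yield an empty result.
     */
    static OptionalInt clockwiseDegrees(int orientation) {
        switch (orientation) {
            case 1:
                return OptionalInt.of(0);   // already upright
            case 3:
                return OptionalInt.of(180); // stored upside down
            case 6:
                return OptionalInt.of(90);  // rotate 90 degrees clockwise to display
            case 8:
                return OptionalInt.of(270); // rotate 270 degrees clockwise to display
            default:
                return OptionalInt.empty();
        }
    }
}
```

A phone photo taken in portrait mode is commonly stored with Orientation 6; once the tag is gone, a viewer falls back to treating the pixels as upright and the picture appears rotated.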

Member

@JohnNiang JohnNiang left a comment

Hi @eziosudo, thanks for your contribution.

I have a few small suggestions; please take a moment to look at them. Thanks!

src/main/java/run/halo/app/handler/file/FileHandlers.java (outdated, resolved)
src/main/java/run/halo/app/utils/FileUtils.java (outdated, resolved)
src/main/java/run/halo/app/utils/FileUtils.java (outdated, resolved)
src/main/java/run/halo/app/utils/FileUtils.java (outdated, resolved)
src/main/java/run/halo/app/utils/ImageUtils.java (outdated, resolved)
@guqing
Member

guqing commented Jun 21, 2022

good job

@eziosudo eziosudo requested review from JohnNiang and guqing and removed request for guqing June 21, 2022 04:47
Member

@JohnNiang JohnNiang left a comment

/lgtm

@f2c-ci-robot f2c-ci-robot bot added the lgtm Indicates that a PR is ready to be merged. label Jun 21, 2022
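@guqing
Member

guqing commented Jun 21, 2022

Hi @eziosudo, in my testing the Exif was not removed. You can check with this site: https://exif.tuchong.com/
My test procedure:

  1. Set the attachment_EXIF_remove_enable flag to true
  2. Take a photo with a phone and read its Exif with https://exif.tuchong.com/
  3. After uploading, find the image in the upload directory under the halo home directory and read its Exif
  4. Actual result: the Exif information is the same before and after upload

/hold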
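
A check like the one above can also be done offline. The following is a rough JDK-only sketch, not part of this PR, that reports whether a JPEG byte array still carries an Exif APP1 segment. The class and method names are hypothetical, and it assumes a well-formed file whose header segments each have a two-byte big-endian length; padding bytes and unusual markers are not handled.

```java
/** Hypothetical checker: looks for an APP1 segment that starts with "Exif\0\0". */
final class ExifSegmentCheck {
    private static final byte[] EXIF_ID = {'E', 'x', 'i', 'f', 0, 0};

    static boolean hasExifSegment(byte[] jpeg) {
        // A JPEG stream must begin with the SOI marker FF D8.
        if (jpeg == null || jpeg.length < 4
            || (jpeg[0] & 0xFF) != 0xFF || (jpeg[1] & 0xFF) != 0xD8) {
            return false;
        }
        int pos = 2;
        while (pos + 4 <= jpeg.length) {
            if ((jpeg[pos] & 0xFF) != 0xFF) {
                return false; // not positioned on a marker; give up
            }
            int marker = jpeg[pos + 1] & 0xFF;
            if (marker == 0xDA || marker == 0xD9) {
                return false; // start of scan or end of image: header segments are over
            }
            // The segment length counts its own two bytes but not the marker.
            int length = ((jpeg[pos + 2] & 0xFF) << 8) | (jpeg[pos + 3] & 0xFF);
            if (length < 2) {
                return false; // malformed segment
            }
            if (marker == 0xE1 && startsWithExifId(jpeg, pos + 4)) {
                return true;
            }
            pos += 2 + length;
        }
        return false;
    }

    private static boolean startsWithExifId(byte[] data, int offset) {
        if (offset + EXIF_ID.length > data.length) {
            return false;
        }
        for (int i = 0; i < EXIF_ID.length; i++) {
            if (data[offset + i] != EXIF_ID[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Because this PR keeps the Orientation tag, an uploaded JPEG may still contain an Exif segment; a check of this kind only shows that some Exif data remains, not which tags.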

@f2c-ci-robot f2c-ci-robot bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 21, 2022
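@JohnNiang
Member

/lgtm

I tested this manually on my machine, and the image's EXIF information was indeed removed.

Test image source: https://photographylife.com/wp-content/uploads/2016/04/Death-Valley-NP-5.jpg

  • Before upload

    image

  • After upload

    image

However, the generated thumbnail came out like this:

image

That said, this issue does not block merging the PR.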

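@guqing
Member

guqing commented Jun 21, 2022

This information is present both before and after removal:
telegram-cloud-photo-size-5-6163642330687582792-y
Most importantly, it includes location and GPS data.
The Composite attributes should be removed as well.
image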

@f2c-ci-robot f2c-ci-robot bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 21, 2022
@ruibaby
Member

ruibaby commented Jun 21, 2022

It works fine in my testing.

Original image:

image

After upload:

image

EXIF viewer: https://exif.tuchong.com/

Member

@ruibaby ruibaby left a comment

/lgtm

@ruibaby
Member

ruibaby commented Jun 21, 2022

/unhold

@f2c-ci-robot f2c-ci-robot bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 21, 2022
Member

@guqing guqing left a comment

/lgtm

@f2c-ci-robot f2c-ci-robot bot added the lgtm Indicates that a PR is ready to be merged. label Jun 21, 2022
Member

@JohnNiang JohnNiang left a comment

Manual test results:

image

/approve

@f2c-ci-robot

f2c-ci-robot bot commented Jun 21, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JohnNiang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@f2c-ci-robot f2c-ci-robot bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 21, 2022
@f2c-ci-robot f2c-ci-robot bot merged commit f5d35dd into halo-dev:master Jun 21, 2022
f2c-ci-robot bot pushed a commit that referenced this pull request Oct 12, 2022
#### What type of PR is this?


#### What this PR does / why we need it:

Release 1.6.0

#### Special notes for your reviewer:

/cc @halo-dev/sig-halo 

```markdown
## Features

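- Add STARTTLS setting support for SMTP. #1861 halo-dev/console#552 @ntdgy @wangxiaoerYah
- Add more available parameters to the email notification templates: `email`, `status`, `createTime`, `authorUrl`. #2095 @Yhcrown @iRoZhi
- Add a `Remove image EXIF information` option to the attachment settings in the admin console. #2168 halo-dev/console#554 @SladeGranger @eziosudo @52lemon6
- Post tag management in the admin console now supports cleaning up unused tags. halo-dev/console#587 @ruibaby
- Rework the friend-link management page in the admin console; it now supports drag-and-drop sorting and group management. halo-dev/console#574 #2105 @Camsyn @daifiyum @gungnir479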

## Improvements

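- Improve the logic for updating a post's edit time; it is now updated only when the post title or content is modified. #2195 @wxyShine @ListenV
- Render the admin page directly instead of redirecting, for better compatibility with reverse proxies. #2259 @viticis
- The admin console now provides gzip-compressed assets at build time. halo-dev/console#547 @2211898719
- Add a save-settings button at the top of the theme settings page in the admin console. halo-dev/console#549 @Aanko
- Change the lang attribute of the admin page from `zh-cmn-Hans` to `cmn-Han`; `zh-cmn-Hans` is deprecated. halo-dev/console#576 @wordlesswind
- Improve the layout of the attachment library list and the attachment picker dialog in the admin console. halo-dev/console#580 @ruibaby
- Change `Power by Halo` in the admin console footer to `Powered by Halo`. halo-dev/console#597 @liaocp666
- Improve how the public/hidden status is set in journal management in the admin console. halo-dev/console#610 @zjy4fun @manction
- Improve how the avatar is set in the admin console profile; submitting the profile form is no longer required. halo-dev/console#619 @wxyShine
- When batch-adding images to the gallery from attachments in the admin console, order them in reverse selection order, so that the first selected image comes first. halo-dev/console#631 @zjy4fun @zyy247796143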

## Bug Fixes

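- Fix `meta_description` being empty on the post detail page. #2282 @guqing @ruibaby
- Fix batch deletion from external cloud storage failing when a file does not exist. #2317 @JustinLiang522 @129duckflew
- Fix Qiniu cloud storage attachments being unable to upload non-image files. #2331 @AirboZH @hexWars
- Fix email notifications being sent for comment replies that have not been approved. #2340 @AirboZH @cetr
- Fix posts in child categories not being available in the page variables of a single category's post page. #2405 @JustinLiang522 @HugeLeaf
- Fix the character limit on the email address in the administrator profile form. halo-dev/console#571 @Yhcrown
- Fix being unable to set the title and the number of items per page for the journals, photos, and links pages. halo-dev/console#601 @JustinLiang522 @manction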

## Dependencies

- The Docker image now uses Eclipse Temurin as its base image instead of AdoptOpenJDK. #2120 @wordlesswind
```

#### Does this PR introduce a user-facing change?

```release-note
None
```
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Request: strip EXIF information when uploading images
4 participants