GH-41238: [Release] Use UTF-8 as the default encoding to upload binary #41242

Merged
kou merged 2 commits into apache:main from release-binary-utf-8 on Apr 17, 2024

Conversation

Member

@kou kou commented Apr 16, 2024

Rationale for this change

We may encounter non-ASCII characters during the binary upload process. For example, a PGP uid may include non-ASCII characters.

What changes are included in this PR?

Use LANG=C.UTF-8 and LC_*=C.UTF-8 to use UTF-8 as the default encoding.

Are these changes tested?

Yes. I used this for 16.0.0 RC0.

Are there any user-facing changes?

No.

… binary

We may encounter non-ASCII characters during the binary upload process. For example, a PGP uid
may include non-ASCII characters.
@kou kou requested review from assignUser and raulcd as code owners April 16, 2024 21:10

⚠️ GitHub issue #41238 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added the `awaiting committer review` label Apr 16, 2024
Member

@assignUser assignUser left a comment

Makes sense 👍

Contributor

@felipecrv felipecrv left a comment

My favorite LANG.

@github-actions github-actions bot added the `awaiting merge` label and removed the `awaiting committer review` label Apr 17, 2024
Member Author

kou commented Apr 17, 2024

I used this for 16.0.0 RC0 and confirmed that UTF-8 is used by debug print.

@kou kou merged commit d49b62d into apache:main Apr 17, 2024
7 checks passed
@kou kou deleted the release-binary-utf-8 branch April 17, 2024 02:57
@kou kou removed the `awaiting merge` label Apr 17, 2024
Comment on lines +25 to +34
export LC_ADDRESS=C.UTF-8
export LC_CTYPE=C.UTF-8
export LC_IDENTIFICATION=C.UTF-8
export LC_MEASUREMENT=C.UTF-8
export LC_MONETARY=C.UTF-8
export LC_NAME=C.UTF-8
export LC_NUMERIC=C.UTF-8
export LC_PAPER=C.UTF-8
export LC_TELEPHONE=C.UTF-8
export LC_TIME=C.UTF-8
Contributor

You could set LC_ALL=C.UTF-8

Member Author

@kou kou Apr 17, 2024

Ah, I misunderstood LC_ALL. I thought it could be overridden by the other LC_* variables, the way LANG can.
I'll simplify the code. Thanks.
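
The precedence discussed in this thread can be sketched as below. This is an illustration only, not code from this PR: `effective_locale` is a hypothetical helper that mimics the POSIX lookup order (`LC_ALL` first, then the category's own `LC_*` variable, then `LANG`). The final fallback to `C` is an assumption based on glibc behavior; POSIX leaves that default implementation-defined.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): print the locale name that
# applies to one category, following LC_ALL > LC_<category> > LANG.
effective_locale() {
  category=$1
  case $category in
    ''|*[!A-Z_]*)
      # Reject anything that is not a plain upper-case variable name.
      echo "unsupported category: $category" >&2
      return 2
      ;;
  esac
  # Read the variable named by $category, e.g. LC_TIME.
  eval "category_value=\${$category:-}"
  if [ -n "${LC_ALL:-}" ]; then
    printf '%s\n' "$LC_ALL"
  elif [ -n "$category_value" ]; then
    printf '%s\n' "$category_value"
  elif [ -n "${LANG:-}" ]; then
    printf '%s\n' "$LANG"
  else
    # Assumption: glibc falls back to "C"; other systems may differ.
    printf '%s\n' C
  fi
}

# LC_ALL overrides both the per-category variable and LANG.
( LC_ALL=C.UTF-8; LC_TIME=C; LANG=C; effective_locale LC_TIME )         # prints C.UTF-8
# Without LC_ALL, the per-category variable overrides LANG.
( unset LC_ALL; LC_TIME=C; LANG=C.UTF-8; effective_locale LC_TIME )     # prints C
# With neither set, LANG is the fallback.
( unset LC_ALL LC_TIME; LANG=C.UTF-8; effective_locale LC_TIME )        # prints C.UTF-8
```

Because `LC_ALL` wins over every per-category variable, a single `export LC_ALL=C.UTF-8` should cover the ten `LC_*` exports quoted above, which is the simplification suggested in the review.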

@github-actions github-actions bot added the `awaiting changes` label Apr 17, 2024

After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit d49b62d.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 10 possible false positives for unstable benchmarks that are known to sometimes produce them.

raulcd pushed a commit that referenced this pull request Apr 29, 2024
#41242)

### Rationale for this change

We may encounter non-ASCII characters during the binary upload process. For example, a PGP uid may include non-ASCII characters.

### What changes are included in this PR?

Use `LANG=C.UTF-8` and `LC_*=C.UTF-8` to use UTF-8 as the default encoding.

### Are these changes tested?

Yes. I used this for 16.0.0 RC0.

### Are there any user-facing changes?

No.
* GitHub Issue: #41238

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>