Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with Restyler Job #2742861 #194

Closed
bzbarsky-apple opened this issue Jan 17, 2023 · 13 comments
Closed

Problem with Restyler Job #2742861 #194

bzbarsky-apple opened this issue Jan 17, 2023 · 13 comments

Comments

@bzbarsky-apple
Copy link

Please use the below template to report an issue with this Job. Don't submit
sensitive information
, this is a public project. If you'd rather communicate
privately, email [email protected].


Hi there-

I'm having a problem with a Restyled Job

What I expected to happen: The job should pass, because I already locally ran the restyled.io docker bits on my code.

What happened instead: The job failed with:

  Exited non-zero (137) for the following paths, ["src/app/clusters/ethernet-network-diagnostics-server/ethernet-network-diagnostics-server.cpp","src/controller/java/zap-generated/CHIPReadCallbacks.cpp","src/controller/java/zap-generated/CHIPReadCallbacks.h","src/darwin/Framework/CHIP/zap-generated/MTRCallbackBridge.h","src/darwin/Framework/CHIP/zap-generated/MTRCallbackBridge.mm","src/include/platform/DiagnosticDataProvider.h","src/platform/Linux/ConnectivityUtils.cpp","src/platform/Linux/ConnectivityUtils.h","src/platform/Linux/DiagnosticDataProviderImpl.cpp","src/platform/Linux/DiagnosticDataProviderImpl.h","src/platform/nxp/mw320/ConnectivityUtils.h","src/platform/webos/ConnectivityUtils.cpp","src/platform/webos/ConnectivityUtils.h","src/platform/webos/DiagnosticDataProviderImpl.cpp","src/platform/webos/DiagnosticDataProviderImpl.h","zzz_generated/app-common/app-common/zap-generated/attributes/Accessors.cpp","zzz_generated/app-common/app-common/zap-generated/attributes/Accessors.h","zzz_generated/app-common/app-common/zap-generated/cluster-enums-check.h","zzz_generated/app-common/app-common/zap-generated/cluster-enums.h","zzz_generated/app-common/app-common/zap-generated/cluster-objects.h","zzz_generated/app-common/app-common/zap-generated/enums.h","zzz_generated/chip-tool/zap-generated/cluster/logging/DataModelLogger.cpp","zzz_generated/chip-tool/zap-generated/test/Commands.h"].

Configuration (if applicable):

#
#    Copyright (c) 2020 Project CHIP Authors
#
#    Licensed under the Apache License, Version 2.0 (the "License");
#    you may not use this file except in compliance with the License.
#    You may obtain a copy of the License at
#
#        http://www.apache.org/licenses/LICENSE-2.0
#
#    Unless required by applicable law or agreed to in writing, software
#    distributed under the License is distributed on an "AS IS" BASIS,
#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#    See the License for the specific language governing permissions and
#    limitations under the License.
#

# Do anything at all
enabled: true

# Push the style fixes directly to the original PR
auto: false

# Open Restyle PRs?
pull_requests: true

# Add labels to any created Restyle PRs
labels:
    - restyled

# Labels to ignore
ignore_labels:
    - restyled-ignore

# Leave comments on the original PR linking to the Restyle PR?
comments: false

# Request review on the Restyle PR
request_review: none

# Patterns to exclude from all Restylers
exclude:
    - ".github/workflows/**/*" # https://github.com/restyled-io/restyler/issues/73
    - ".github/workflows/*" # https://github.com/restyled-io/restyler/issues/73
    - ".github/**/*" # https://github.com/restyled-io/restyler/issues/73
    - ".github/*" # https://github.com/restyled-io/restyler/issues/73
    - "*/**/third_party/*" # https://github.com/restyled-io/restyled.io/issues/213
    - "*/**/third_party/**" # https://github.com/restyled-io/restyled.io/issues/213
    - "*/**/third_party/**/*" # https://github.com/restyled-io/restyled.io/issues/213
    - "third_party/android/**/*"
    - "third_party/inipp/repo/**/*"
    - "third_party/jlink/**/*"
    - "third_party/lwip/repo/**/*"
    - "third_party/nlbuild-autotools/repo/**/*"
    - "third_party/nlbuild-autotools/repo/.**/*" # "**/*" doesn't include directories that start with "."
    - "third_party/nlassert/repo/**/*" # from here down built with $ awk '/path =/ {print $3 "/**/*"}' .gitmodules
    - "third_party/nlfaultinjection/repo/**/*"
    - "third_party/nlio/repo/**/*"
    - "third_party/nlunit-test/repo/**/*"
    - "third_party/mbedtls/repo/**/*"
    - "examples/common/QRCode/repo/**/*"
    - "examples/common/m5stack-tft/repo/**/*"
    - "third_party/pigweed/repo/**/*"
    - "third_party/openthread/repo/**/*"
    - "third_party/ot-br-posix/repo/**/*"
    - "third_party/bluez/repo/**/*"
    - "third_party/cirque/repo/**/*"
    - "third_party/nanopb/repo/**/*"
    - "examples/android/CHIPTool/gradlew" # gradle wrapper generated file
    - "third_party/android_deps/gradlew" # gradle wrapper generated file
    - "src/controller/python/chip/clusters/Objects.py" # generated file, no point to restyle
    - "src/controller/python/chip/clusters/CHIPClusters.py" # generated file, no point to restyle
    - "scripts/py_matter_idl/matter_idl/tests/outputs/**/*" # Matches generated output 1:1
    - "examples/chef/sample_app_util/test_files/*.yaml"
    - "examples/chef/zzz_generated/**/*"
    - "examples/platform/nxp/k32w/k32w0/scripts/demo_generated_certs/**/*"
    - "integrations/cloudbuild/*.yaml" # uglier long command line content
    - "scripts/run_codegen_targets.sh" # shellharden breaks for loops over command outputs
    - "src/darwin/Framework/CHIP/zap-generated/MTRBaseClusters.h" # Too big?
    - "src/darwin/Framework/CHIP/zap-generated/MTRBaseClusters.mm" # Too big?


changed_paths:
  maximum: 100000
  outcome: error

# Restylers to run, and how
restylers:
    - name: whitespace
      enabled: true
      image: restyled/restyler-whitespace:v0.1.0.1
      command:
          - whitespace
      arguments: []
      include:
          - "**/Dockerfile"
          - ".**/Dockerfile"
          - "**/*.yml"
          - "**/*.yaml"
          - ".**/*.yml"
          - ".**/*.yaml"
          - "**/*.sh"
          - "**/*.c"
          - "**/*.cc"
          - "**/*.cpp"
          - "**/*.ipp"
          - "**/*.cxx"
          - "**/*.c++"
          - "**/*.C"
          - "**/*.cs"
          - "**/*.h"
          - "**/*.hh"
          - "**/*.hpp"
          - "**/*.hxx"
          - "**/*.h++"
          - "**/*.H"
          - "**/*.java"
          - "**/*.js"
          - "**/*.json"
          - ".**/*.js"
          - ".**/*.json"
          - "**/*.m"
          - "**/*.mm"
    - name: google-java-format
      enabled: true
      # Update https://github.com/project-chip/connectedhomeip/blob/master/scripts/tools/zap/generate.py if this version changes.
      image: restyled/restyler-google-java-format:v1.6
      command:
          - google-java-format
          - "--replace"
      arguments: []
      include:
          - "**/*.java"
      interpreters: []
    - name: clang-format
      enabled: true
      image: restyled/restyler-clang-format:v9.0.0
      command:
          - clang-format
          - "-i"
      arguments: []
      include:
          - "**/*.c"
          - "**/*.cc"
          - "**/*.cpp"
          - "**/*.ipp"
          - "**/*.cxx"
          - "**/*.c++"
          - "**/*.C"
          - "**/*.cs"
          - "**/*.h"
          - "**/*.hh"
          - "**/*.hpp"
          - "**/*.hxx"
          - "**/*.h++"
          - "**/*.H"
          - "**/*.js"
          - "**/*.m"
          - "**/*.mm"
      interpreters: []
    - name: gn
      image: restyled/restyler-gn:v1
      enabled: true
      include:
          - "**/*.gn"
          - "**/*.gni"
    - name: prettier-json
      image: restyled/restyler-prettier:v1.19.1-2
      enabled: true
      command:
          - prettier
          - "--write"
      arguments: []
      include:
          - "**/*.json"
          - ".**/*.json"
      interpreters: []
    - name: prettier-markdown
      image: restyled/restyler-prettier:v1.19.1-2
      command:
          - prettier
          - "--write"
      arguments: []
      enabled: true
      include:
          - "**/*.md"
          - "**/*.markdown"
    - name: prettier-yaml
      image: restyled/restyler-prettier:v1.19.1-2
      enabled: true
      include:
          - "**/*.yml"
          - "**/*.yaml"
    - name: shellharden
      image: restyled/restyler-shellharden:v4.1.1-2
      enabled: true
      include:
          - "**/*.sh"
          - "**/*.bash"
    - name: shfmt
      image: restyled/restyler-shfmt:v3.0.1
      enabled: true
      command:
          - shfmt
          - "-w"
      arguments:
          - "-i"
          - "4"
          - "-ci"
      interpreters:
          - sh
          - bash
      include:
          - "**/*.sh"
          - "**/*.bash"
    - name: autopep8
      image: 'restyled/restyler-autopep8:v2.0.0'
      command:
        - autopep8
        - '--in-place'
      arguments: []
      include:
        - '**/*.py'
      interpreters:
        - python
    - name: isort
      image: 'restyled/restyler-isort:v5.11.2'
      command:
        - isort
      arguments: []
      include:
        - '**/*.py'
      interpreters:
        - python
@pbrisbin
Copy link
Member

Hi there, thanks for the report!

The 137 exit code indicates out of memory. In other words, clang-format, when run on these files, requires more memory than we allow restyler containers to use (512MB). Here are the files in a slightly more readable form,

src/app/clusters/ethernet-network-diagnostics-server/ethernet-network-diagnostics-server.cpp
src/controller/java/zap-generated/CHIPReadCallbacks.cpp
src/controller/java/zap-generated/CHIPReadCallbacks.h
src/darwin/Framework/CHIP/zap-generated/MTRCallbackBridge.h
src/darwin/Framework/CHIP/zap-generated/MTRCallbackBridge.mm
src/include/platform/DiagnosticDataProvider.h
src/platform/Linux/ConnectivityUtils.cpp
src/platform/Linux/ConnectivityUtils.h
src/platform/Linux/DiagnosticDataProviderImpl.cpp
src/platform/Linux/DiagnosticDataProviderImpl.h
src/platform/nxp/mw320/ConnectivityUtils.h
src/platform/webos/ConnectivityUtils.cpp
src/platform/webos/ConnectivityUtils.h
src/platform/webos/DiagnosticDataProviderImpl.cpp
src/platform/webos/DiagnosticDataProviderImpl.h
zzz_generated/app-common/app-common/zap-generated/attributes/Accessors.cpp
zzz_generated/app-common/app-common/zap-generated/attributes/Accessors.h
zzz_generated/app-common/app-common/zap-generated/cluster-enums-check.h
zzz_generated/app-common/app-common/zap-generated/cluster-enums.h
zzz_generated/app-common/app-common/zap-generated/cluster-objects.h
zzz_generated/app-common/app-common/zap-generated/enums.h
zzz_generated/chip-tool/zap-generated/cluster/logging/DataModelLogger.cpp
zzz_generated/chip-tool/zap-generated/test/Commands.h

There are 3 possible causes for this:

  1. Our memory limit is unreasonable and should be raised

    I'm open to arguments to raise the memory limit, though I think they'd have to be pretty compelling. 512MB is already quite a lot, and raising a limit as a solution usually just leads to a new bug report when the higher limit is exceeded. If we want to go this direction, I'd like to know how much clang-format really needs (as measured in local runs) and if/how it scales with the number and size of input files.

  2. Restyled is doing something that causes more memory to be consumed than is reasonable

    As is common, this restyler does effectively nothing except execute clang-format. So I think this is extremely unlikely. We do invoke clang-format with all files at once, and changing it to run on one file at a time (which, though undocumented, can be done in .restyled.yaml) could trade memory for runtime -- but only if the issue is due to the number of files and not a single large file.

  3. clang-format is doing something that causes more memory to be consumed than is reasonable

    If a tool holds all files in memory for the entire process, a large list of files can exceed our limits. If a tool does slightly better, and holds only the current file in memory, a very large file can exceed our limits. Either case would be a bug in the upstream tool for not processing things in a streaming fashion. If this bug is not fixable (maybe it needs whole-program analysis to do its job), then our option is to avoid the limits by ensuring PRs change fewer files and/or excluding large files from Restyling.

With all that in mind: are you sure the zzz_generated/ files need reformatting? Often generated files are quite large and can trigger problems with memory limits. If you exclude them it may work.

@bzbarsky-apple
Copy link
Author

We're investigating not restyling the zzz_generated files; there are some issues around that which need to be sorted out on our end before we can do that.

In the meantime, I'm happy to do the measurement for item 1 if you tell me the exact thing you want measured.

For item 2, if you tell me what option in .restyled.yaml to try, I can see if that will at least get us unblocked for the moment.

For item 3, the PRs involved are changing the minimal possible number of files already: they are renaming a type and doing a search/replace of the type name in the places where it's used.

@pbrisbin
Copy link
Member

I'm happy to do the measurement for item 1 if you tell me the exact thing you want measured

In an ideal would, I'd like to see something like the following filled out,

Number of input files Maximum size of file Maximum memory used by clang-format
1 <something small>
1 <something large>
50 <something small>
50 <something large>

I'm not exactly sure what "something small/large" is, or how to measure the maximum memory used by a process. This is kind of an open-ended debugging we're going into together here.

if you tell me what option in .restyled.yaml to try, I can see if that will at least get us unblocked for the moment.

Unfortunately, on second look it's not as easy as I thought to do that. Let me think about this one a bit and see if I can do some work to enable that.

I'm curious to learn more about how this is blocking. If you've run the restyle locally and are sure Restyled wouldn't change anything it were to succeed, can you not just ignore the error for this particular PR?

@bzbarsky-apple
Copy link
Author

bzbarsky-apple commented Jan 18, 2023

For what it's worth, I am now seeing the error on project-chip/connectedhomeip#24398 where last week that same PR (with very minor changes) was passing.

In an ideal would, I'd like to see something like the following filled out,

OK, so you just want the memory used by clang-format itself, not the entire docker image? And I assume you want this for clang-format 9, which is what the restyler is running? That might be a little complicated; clang-format 9 is a bit of a pain to get your hands on (at least on Mac, which is what I have to work with).

If you've run the restyle locally and are sure Restyled wouldn't change anything it were to succeed, can you not just ignore the error for this particular PR?

No, because the restyle job is marked as a required CI job for the relevant repository. I'd have to find one of the admins to admin-merge it, which is ... a lot of overhead. And I am getting this for 5 different PRs at this point, and likely more to come.

@pbrisbin
Copy link
Member

OK, so you just want the memory used by clang-format itself, not the entire docker image?

Either is fine; though if you found a marked difference between the two that would be useful information! As I mentioned, the docker image does absolutely nothing besides run clang-format so I don't expect there to be any memory overhead to it.

No, because the restyle job is marked as a required CI job for the relevant repository. I'd have to find one of the admins to admin-merge it.

I see. Well I'm honored to be a required status!

As a workaround, you could disable clang-format in your .restyled.yaml, get the PR(s) in, and then re-enable it.

And I am getting this for 5 different PRs at this point, and likely more to come.

Well this is definitely interesting and points to something systemic that is wrong with the underlying instances where this stuff runs. But that's unlikely since natural scaling activities have replaced the instances multiple times, as recently as 5PM ET. I'll have to continue digging.

@pbrisbin
Copy link
Member

natural scaling activities have replaced the instances multiple times, as recently as 5PM ET

Wait, I misread a graph! The current instance has been around a while (though it certainly seems healthy). I'll rotate it out and see if that helps.

@bzbarsky-apple
Copy link
Author

For what it's worth, at this point we've stopped restyling the zzz_generated files, so that issue is addressed.

That said, when I run restyled locally on an Intel Mac on the set of files changed in project-chip/connectedhomeip@52b1eb0 and project-chip/connectedhomeip@4abb1f0 and project-chip/connectedhomeip@47e37c9 (which is the relevant set for the failure above) I see the docker claiming that the memory for the image goes up to 490MB (it's sampling, I guess, so not sure whether it goes higher). So it's pretty borderline if 512MB is the cap....

@pbrisbin
Copy link
Member

What command do you use to run restyled locally?

@pbrisbin
Copy link
Member

pbrisbin commented Jan 18, 2023

Based on my metrics, things seem better since I rotated the instance. It's also possible you just aren't running many jobs yet (it's early in ET at least).

@bzbarsky-apple
Copy link
Author

What command do you use to run restyled locally?

docker run -it --rm \
  --env DEBUG=1 \
  --env HOST_DIRECTORY="$PWD" \
  --env UNRESTRICTED=1 \
  --volume "$PWD":/code \
  --volume /tmp:/tmp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  --entrypoint restyle-path \
  "restyled/restyler:master-final" file names go here

It's also possible you just aren't running many jobs yet

We also stopped restyling the generated files...

@pbrisbin
Copy link
Member

I you remove --env UNRESTRICTED=1 that would apply the 512MB limit locally too. If it also fails there, that would rule out a problem on Restyler's infrastructure. If it does not, that'd be further evidence it was bad instance state (which I've cleared).

Are things looking better on your end? I'm not showing any failures in your project since 7:40PM ET last night, which is basically when I rotated the instance.

@bzbarsky-apple
Copy link
Author

I you remove --env UNRESTRICTED=1 that would apply the 512MB limit locally too.

That doesn't seem to fail, on a set of files that CI failed on....

On our end, we are no longer restyling the generated files, so I would not expect this to come up as an issue much anymore.... We can probably just close this.

@pbrisbin
Copy link
Member

We can probably just close this.

Sounds good.

Based on the evidence so far, it seems like something was going on with the underlying EC2 instance and rotating it out was the solution. Skipping generated files is still a Good Thing, generally speaking though, to avoid these sorts of issues when they're not caused by something systemic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants