Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release Notes tracker for component repos #2438

Merged
merged 1 commit into from
Aug 22, 2022

Conversation

prudhvigodithi
Copy link
Collaborator

@prudhvigodithi prudhvigodithi commented Aug 8, 2022

Description

  • Release tracker for component repos, thats gives information about the most recent commitID after a specific date and checks if there are any associated release notes.
    Expected inputs:
  1. GIT_LOG_DATE: To check if commit exists after a specific date
  2. ADD_COMMENT: true/false
  3. GIT_ISSUE_NUMBER: release issue to add the generated table as comment.
  4. GITHUB_TOKEN: token used to add the comment
  • This PR adds code that adheres to existing workflows

  • Execution command: ./run_releasenotes_check.sh manifests/2.2.0/opensearch-2.2.0.yml

  • Create a mark-down table as follows, adding the latest commitID after a certain date, and if contains a release notes with YES/NO/NULL

|    Repo    |Branch|CommitID|Release Notes|
|------------|------|--------|-------------|
|OpenSearch  |2.x   |        |NO           |
|common-utils|2.x   |        |NO           |
|security    |main  |        |YES          |
  • Pending: To update README

Issues Resolved

#2345

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@codecov-commenter
Copy link

codecov-commenter commented Aug 8, 2022

Codecov Report

Merging #2438 (c2cd87b) into main (0720e05) will decrease coverage by 0.27%.
The diff coverage is 88.80%.

❗ Current head c2cd87b differs from pull request most recent head 4bcf433. Consider uploading reports for the commit 4bcf433 to get more accurate results

@@             Coverage Diff              @@
##               main    #2438      +/-   ##
============================================
- Coverage     94.63%   94.35%   -0.28%     
  Complexity       27       27              
============================================
  Files           213      218       +5     
  Lines          4324     4449     +125     
  Branches         29       29              
============================================
+ Hits           4092     4198     +106     
- Misses          226      245      +19     
  Partials          6        6              
Impacted Files Coverage Δ
src/run_releasenotes_check.py 41.17% <41.17%> (ø)
src/release_notes_workflow/release_notes.py 91.66% <91.66%> (ø)
.../release_notes_workflow/release_notes_component.py 96.96% <96.96%> (ø)
src/git/git_commit.py 100.00% <100.00%> (ø)
src/git/git_repository.py 94.11% <100.00%> (+1.13%) ⬆️
...release_notes_workflow/release_notes_check_args.py 100.00% <100.00%> (ø)
src/system/temporary_directory.py 85.71% <0.00%> (-11.43%) ⬇️
src/system/os.py 93.75% <0.00%> (-6.25%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is a good start, but breaks several patterns in the python code in this repo.

  1. Arguments are passed as ENV variables, where they should be passed as --arguments. Create an arguments class.
  2. It's not object oriented, mixes many concerns, making it quite messy and difficult to extend. Follow the workflow pattern we have, start by adding a class that encapsulates the concept of release notes, another to get a collection of commits (each an instance of a class) after a date, etc.
  3. Python has nice OO ways to manipulate git, I googled https://gitpython.readthedocs.io/en/stable/tutorial.html, get rid of all command shell calls to git.

I don't think this needs to be a Jenkins workflow, consider GHA? The commenting on a GitHub issue can be done in 2 lines with an existing GitHub action.

@prudhvigodithi
Copy link
Collaborator Author

prudhvigodithi commented Aug 8, 2022

This code is a good start, but breaks several patterns in the python code in this repo.

  1. Arguments are passed as ENV variables, where they should be passed as --arguments. Create an arguments class.
  2. It's not object oriented, mixes many concerns, making it quite messy and difficult to extend. Follow the workflow pattern we have, start by adding a class that encapsulates the concept of release notes.

I don't think this needs to be a Jenkins workflow, consider GHA? The commenting on a GitHub issue can be done in 2 lines with an existing GitHub action.

Hey @dblock
The reason I went with ENV for few values is I have extended the existing checkout workflow (checkoutArgs), so that way I dont have to create a new args for this minor workflow.
Also we need a jenkins job here as the input is the DATE (which us unknown to automate as the release dates could change) to get the latest the commitID after the DATE and check if release notes exists or not, with jenkins job its easy to run (by a release manager) during the release lifecycle (code freeze time), to check if the commit exists after a date (passed as an input), followed by to check if there exists a release notes.

./run_releasenotes_check.sh manifests/2.2.0/opensearch-2.2.0.yml with one arg (input manifest) and by ENV the GIT_LOG_DATE, the commitID and release notes can be found, finally if we want to add this info to an issue using GIT_ISSUE_NUMBER should we solved by the code, which limits the new arguments class.
andThank you

@prudhvigodithi
Copy link
Collaborator Author

prudhvigodithi commented Aug 8, 2022

3. Python has nice OO ways to manipulate git, I googled https://gitpython.readthedocs.io/en/stable/tutorial.html, get rid of all command shell calls to git.

This will fail with our code setup, which I have tried initially as once we import gitpython the library is called as git, but this will fail as we already have something called src/git/
with gh cli its easy to hit the GitHub API as well.
@dblock

@dblock
Copy link
Member

dblock commented Aug 8, 2022

Hey @dblock The reason I went with ENV for few values is I have extended the existing checkout workflow (checkoutArgs), so that way I dont have to create a new args for this minor workflow.

This is not a good reason, you're just acquiring technical debt.

Also we need a jenkins job here as the input is the DATE (which us unknown to automate as the release dates could change) to get the latest the commitID after the DATE and check if release notes exists or not, with jenkins job its easy to run (by a release manager) during the release lifecycle (code freeze time), to check if the commit exists after a date (passed as an input), followed by to check if there exists a release notes.

Ok

./run_releasenotes_check.sh manifests/2.2.0/opensearch-2.2.0.yml with one arg (input manifest) and by ENV the GIT_LOG_DATE, the commitID and release notes can be found, finally if we want to add this info to an issue using GIT_ISSUE_NUMBER should we solved by the code, which limits the new arguments class. andThank you

I don't understand this. All our scripts use command line arguments. You're creating something that doesn't. There's no good reason for it.

@dblock
Copy link
Member

dblock commented Aug 8, 2022

  1. Python has nice OO ways to manipulate git, I googled https://gitpython.readthedocs.io/en/stable/tutorial.html, get rid of all command shell calls to git.

This will fail with our code setup, which I have tried initially as once we import gitpython the library is called as git, but this will fail as we already have something called src/git/ with gh cli its easy to hit the GitHub API as well. @dblock

Use from gitpython import ... and you don't have this problem. And if you still do, fix it by moving the existing git folder to a different one. There's about 100 reasons to use a library. For example, you will get a proper error when something doesn't work. It's easier to mock/stub/test. Don't create more technical debt, please.

@prudhvigodithi
Copy link
Collaborator Author

prudhvigodithi commented Aug 8, 2022

Thanks for the suggestions @dblock
I'm fine with having separate args for this workflow and agree which makes it clean, but I was doubtful if that will be pushed or not as this workflow mostly is not shared and for specific use case hence went with ENV approach. Secondly from gitpython import ... this wont work as well, as we have src/git/ which forces to use this local git, I myself against creating technical debt as well :)
I dont think so with GitPython the library out there support API calls to GITHUB (create/comment on an issue etc), its good for normal git cli operations, gh cli is more easy and better with less code here, I will explore more on this and update in my next commit.
Thank you
@bbarani @peterzhuamazon

Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's quite a bit more work to do to make this well organized.

You have a list of components (already implemented). Each is checked out into a folder (needs to be implemented). Each may have release notes (to be implemented). Don't put all this unrelated logic into a script. If all these are properties of classes then you'll end up with a very simple runner and an easy to test and extend implementation. Give it a try.

src/releasenotes_check_workflow/releasenotes_check_args.py Outdated Show resolved Hide resolved
src/releasenotes_check_workflow/releasenotes_check_args.py Outdated Show resolved Hide resolved
src/releasenotes_check_workflow/releasenotes_check_args.py Outdated Show resolved Hide resolved
src/run_releasenotes_check.py Outdated Show resolved Hide resolved
@prudhvigodithi
Copy link
Collaborator Author

prudhvigodithi commented Aug 9, 2022

Hey @dblock I have modified the code to make sure its separated with classes, methods and using separate args.
Thank you

Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See below for next steps.

Pipfile Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
run_releasenotes_check.sh Outdated Show resolved Hide resolved
src/releasenotes_check_workflow/README.md Outdated Show resolved Hide resolved
src/releasenotes_check_workflow/releasenotes_check.py Outdated Show resolved Hide resolved
@peterzhuamazon
Copy link
Member

Hi,

This is just a list of improvements for later:

  1. Link to actual release notes if exists, easy to track.
  2. Adding the ability to generate final release notes so we dont need to use the legacy script anymore.
  3. Ability to edit the original comment instead of creating a new comment.

Thanks.

@prudhvigodithi
Copy link
Collaborator Author

prudhvigodithi commented Aug 9, 2022

Hi,

This is just a list of improvements for later:

  1. Link to actual release notes if exists, easy to track.
  2. Adding the ability to generate final release notes so we dont need to use the legacy script anymore.
  3. Ability to edit the original comment instead of creating a new comment.

Thanks.

Hey @peterzhuamazon
yes 1. is possible, 2. we have to extent this release_notes_workflow to generate final release notes , 3. using gh cli to update a comment, there is an open issue which says editing of the comment is not possible with current version cli/cli#3613 (Should research on some other solution).

@dblock
Copy link
Member

dblock commented Aug 10, 2022

Hey @peterzhuamazon yes 1. is possible, 2. we have to extent this release_notes_workflow to generate final release notes , 3. using gh cli to update a comment, there is an open issue cli/cli#3613 (Should research on some other solution).

Part (3) is probably indicative of us trying to over-engineer something. There's already a tool for adding or editing a comment on GitHub called gh. Unix has support for pipes where output of one tool can be piped into another tool. Or a file can be created that becomes input to another tool. Let's ask ourselves what business value we are adding by doing the additional work of taking input parameters such as GitHub token, issue URL, and a flag to add a comment, then shelling out to a tool that does it? Seems to me that just calling an existing tool is a lot simpler.

@prudhvigodithi
Copy link
Collaborator Author

prudhvigodithi commented Aug 10, 2022

Part (3) is probably indicative of us trying to over-engineer something. There's already a tool for adding or editing a comment on GitHub called gh. Unix has support for pipes where output of one tool can be piped into another tool. Or a file can be created that becomes input to another tool. Let's ask ourselves what business value we are adding by doing the additional work of taking input parameters such as GitHub token, issue URL, and a flag to add a comment, then shelling out to a tool that does it? Seems to me that just calling an existing tool is a lot simpler.

Hey @dblock, for creation of comment we need a right github repo URL, issue number and token to authenticate the repo, Initially I have passed via environmental values but inclining with our workflows I have created a new args class ReleaseNotesCheckArgs and passed as arguments, these are the inputs expected by gh cli, so what approach you suggest here instead of passing GitHub token and issue URL? Coming to --addcomment :)
its an extra option provided, use --addcomment to proceed with GH issue commenting, so that it supports to execute the workflow with/without adding comment to github, so you say this extras option is not worth?

@dblock
Copy link
Member

dblock commented Aug 10, 2022

Part (3) is probably indicative of us trying to over-engineer something. There's already a tool for adding or editing a comment on GitHub called gh. Unix has support for pipes where output of one tool can be piped into another tool. Or a file can be created that becomes input to another tool. Let's ask ourselves what business value we are adding by doing the additional work of taking input parameters such as GitHub token, issue URL, and a flag to add a comment, then shelling out to a tool that does it? Seems to me that just calling an existing tool is a lot simpler.

Hey @dblock, for creation of comment we need a right github repo URL, issue number and token to authenticate the repo, Initially I have passed via environmental values but inclining with our workflows I have created a new args class ReleaseNotesCheckArgs and passed as arguments, these are the inputs expected by gh cli, so what approach you suggest here instead of passing GitHub token and issue URL? Coming to --addcomment :) its an extra option provided, use --addcomment to proceed with GH issue commenting, so that it supports to execute the workflow with/without adding comment to github, so you say this extras option is not worth?

I propose not to do this in this tool, not to add --addcomment or any related parameters, and use gh directly in the Jenkins workflow to add the output from release_notes.sh check ... to a GitHub issue.

Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good.

The major pieces of feedback is in tests that need to stub execution of git commands instead of doing actual git commands.

src/release_notes_workflow/component_release_notes.py Outdated Show resolved Hide resolved
src/release_notes_workflow/release_notes.py Show resolved Hide resolved
src/run_releasenotes_check.py Outdated Show resolved Hide resolved
tests/test_run_releasenotes_check.py Outdated Show resolved Hide resolved
tests/tests_git/test_git_repository.py Outdated Show resolved Hide resolved

gitLogDate = '2022-07-26'

gitLogDateAssert = datetime.datetime.strptime(gitLogDate, "%Y-%m-%d").date()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're not reusing variables, don't declare them as globals. Also camelCase is not Python.

)
)

gitLogDate = '2022-07-26'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just hard-code it in each of the tests, just like the word "check" for example. The value only matters in 1 of the tests and it's easier to read the tests individually.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you resolved this, but 1) it makes tests hard to read 2) it's still camelCase which is not Python. Change gitLogDate below to the actual date "2022-07-26", there's no need for this variable.

Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting closer. The big missing piece is that tests lump multiple parts under test in basically two tests in test_releasenotes_check_function. That needs to be broken up and unit tests written for each separate class that was added or changed.

src/release_notes_workflow/README.md Outdated Show resolved Hide resolved
tests/test_run_releasenotes_check.py Outdated Show resolved Hide resolved
tests/tests_git/test_git_repository.py Show resolved Hide resolved
)
)

gitLogDate = '2022-07-26'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you resolved this, but 1) it makes tests hard to read 2) it's still camelCase which is not Python. Change gitLogDate below to the actual date "2022-07-26", there's no need for this variable.

@prudhvigodithi
Copy link
Collaborator Author

Hey @dblock I have addressed the suggested changes, can you review back again?
Thank you

@prudhvigodithi prudhvigodithi added the feature New feature label Aug 19, 2022
@prudhvigodithi prudhvigodithi self-assigned this Aug 19, 2022
Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's quite a bit of work to get the tests to actually test what is intended in this code.

tests/test_run_releasenotes_check.py Outdated Show resolved Hide resolved
tests/test_run_releasenotes_check.py Outdated Show resolved Hide resolved
tests/tests_git/test_git_repository.py Outdated Show resolved Hide resolved
tests/tests_git/test_git_repository.py Show resolved Hide resolved
tests/tests_release_notes_workflow/test_release_notes.py Outdated Show resolved Hide resolved
tests/tests_release_notes_workflow/test_release_notes.py Outdated Show resolved Hide resolved
Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Fix CI on Windows, needs an os.path.join instead of a hard-coded /.
  2. See below.

Signed-off-by: prudhvigodithi <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants