Scheduled Link Checks of OSCAL Repo's Main Branch and Automatically Open Issues for Broken Links #1230

Closed
3 of 5 tasks
aj-stein-nist opened this issue May 4, 2022 · 3 comments · Fixed by #1231 or #1239
Labels: Developer Experience (issues around enhancing and optimizing work for development of NIST OSCAL artifacts), enhancement, Scope: CI/CD (enhancements to the project's Continuous Integration and Continuous Delivery pipeline), Scope: Website (issues targeted at the OSCAL project website), User Story

Comments

@aj-stein-nist
Contributor

aj-stein-nist commented May 4, 2022

User Story:

As an OSCAL tool developer, in order to know that all internal and external hyperlinks remain valid over time, and not only when a developer happens to push changes (which are often unrelated to the links themselves), I want OSCAL's CI/CD automation to check the OSCAL website and the repo's Markdown documentation for broken links on a schedule. If a broken link is found, I would like a new issue to be opened automatically, identifying the broken link so that an available developer can fix it.

Goals:

Improve the OSCAL CI/CD system so that broken links are detected outside of a developer's code/test/push cycle, which is often unrelated to documentation and website improvements.

Dependencies:

Complete after #1208 is complete.

Acceptance Criteria

  • Add a GitHub Actions workflow to schedule a cron job for the markdown-link-check action for Markdown docs outside the website content.
  • Add a GitHub Actions workflow to schedule a cron job for the lychee-action action for OSCAL website content link checks.
  • All OSCAL website and readme documentation affected by the changes in this issue have been updated. Changes to the OSCAL website can be made in the docs/content directory of your branch.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.
@aj-stein-nist added the enhancement, User Story, Scope: Website, Scope: CI/CD, and Developer Experience labels on May 4, 2022
@aj-stein-nist added this to the Backlog milestone on May 4, 2022
@david-waltermire modified the milestones: Backlog, OSCAL 1.0.4 on May 5, 2022
@aj-stein-nist
Contributor Author

aj-stein-nist commented May 10, 2022

Ok, @david-waltermire-nist here is the status report as I am at a juncture where I need your critical feedback and input. 😅

  • The cron job for lychee is in better shape and close to done for website content.
  • I have to decide what to do with markdown-link-check, both for the cron job and in general. I reviewed our goals and AC; here are the challenges and options:
    • markdown-link-check and the GHA mlc action have no native reporting format, only output to the console.
    • Both lychee and markdown-link-check can be used as a library (nice work from the lychee devs with a feature comparison matrix), but the core of the latter has no reporting capability, only a callback mechanism.
    • I can work on improving markdown-link-check and/or the action.
    • I can pivot to lychee.
    • I just ran lychee locally with xargs on only the Markdown files (git ls-files "*/*.md" -z | grep --null-data -v "^docs/" | xargs -0 lychee --exclude-file ./build/config/.lycheeignore --verbose --accept 200,206,429 --no-progress). The only issue it found was, ironically, something MLC had not caught before: templated strings in build/README.md such as https://github.com/ndw/xmlcalabash1/releases/download/$%7BCALABASH_VERSION%7D/xmlcalabash-$%7BCALABASH_VERSION%7D.zip and https://github.com/gohugoio/hugo/releases/download/v$%7BHUGO_VERSION%7D/hugo_extended_$%7BHUGO_VERSION%7D_Linux-64bit.deb.
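The templated strings flagged in the list above are URLs that still carry an unexpanded ${NAME} placeholder, which shows up percent-encoded as $%7BNAME%7D. As a rough sketch only (the helper name is hypothetical and this is not part of lychee or the OSCAL build), such URLs could be recognized in shell before deciding whether to exclude them:

```shell
#!/bin/sh
# Hypothetical helper: succeed when URL $1 still contains an unexpanded
# ${NAME} placeholder, either raw or percent-encoded as $%7BNAME%7D.
is_templated_url() {
  printf '%s\n' "$1" | grep -Eq '\$(\{|%7B)[A-Za-z_][A-Za-z0-9_]*(\}|%7D)'
}

# One of the URLs reported from build/README.md.
url='https://github.com/ndw/xmlcalabash1/releases/download/$%7BCALABASH_VERSION%7D/xmlcalabash-$%7BCALABASH_VERSION%7D.zip'

if is_templated_url "$url"; then
  echo "skip (templated): $url"
fi
```

Whether such links should be excluded or the README rewritten is a separate decision; the sketch only shows how they can be told apart from ordinary links.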

Wasn't there a reason we didn't want to use lychee for Markdown content? I spent some time noodling on that but I cannot remember the specific issue.

@aj-stein-nist
Contributor Author

Now I remember: it is the local relative path rewriting feature we use for GitHub links to a repo's issue board, currently in the mlc config. I guess I will sync up with Dave later about which of the two we use to move forward, and resolve this soon (or maybe skip checking Markdown links on a cron schedule altogether).

@aj-stein-nist
Contributor Author

I also figured out a workaround with a faster turnaround: remove the -q argument from markdown-link-check, capture the piped output of the command-line execution in a labelled output variable (StackOverflow reference), and then insert that named output into a GitHub issue when the returned status code is not 0.
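A minimal sketch of that workaround, under stated assumptions: fake_checker stands in for the real markdown-link-check call (run without -q) so the snippet works offline, its output line is made up, and check_links and the output name mlc_report are hypothetical. Writing a multiline value to the file named by GITHUB_OUTPUT is how current GitHub Actions runners expose step outputs; the workflow in this repo may use a different mechanism.

```shell
#!/bin/sh
# Hypothetical helper: run a command, keeping its combined output and
# exit status in the globals `report` and `status`.
check_links() {
  report=$("$@" 2>&1)
  status=$?
}

# Stand-in for `markdown-link-check <file>`; prints a made-up failure line.
fake_checker() {
  echo "dead link: https://example.invalid/missing"
  return 1
}

check_links fake_checker

if [ "$status" -ne 0 ]; then
  # Append a multiline step output named `mlc_report`.
  # Falls back to /dev/null when not running inside GitHub Actions.
  {
    echo "mlc_report<<MLC_EOF"
    printf '%s\n' "$report"
    echo "MLC_EOF"
  } >> "${GITHUB_OUTPUT:-/dev/null}"
  echo "link check failed with status $status"
fi
```

A later workflow step could then read that named output and pass it as the body of a new issue.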

Also, from a quick sitrep with Dave: we need to change the wildcard from git ls-files "*/*.md" to git ls-files "*.md" so that it catches the top-level README, which we are currently missing. We also need to add the "pattern": "^#.*" entry so that we do not check bare anchor links in that README, or the check will bomb. We cannot really count on those validating at PR time, because the anchors GitHub generates dynamically are not always reflected 1:1 in the local copies, which induces false positives. :-)
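Both fixes can be illustrated without running git or the link checker. The sketch below is an approximation, not the project's configuration: it relies on shell case matching, where * also matches /, much like a default git pathspec, to show why "*/*.md" skips a top-level README, and on grep -E to show which links a pattern of ^#.* would cover. The helper names and sample paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed when path $1 matches glob $2.
# In a `case` pattern, `*` also matches `/`.
matches() {
  case "$1" in
    $2) return 0 ;;
    *)  return 1 ;;
  esac
}

# Hypothetical helper: succeed when link $1 is a bare anchor ("^#.*").
is_bare_anchor() {
  printf '%s\n' "$1" | grep -Eq '^#.*'
}

for f in README.md build/README.md src/utils/notes.md; do
  if matches "$f" '*/*.md'; then echo "*/*.md matches $f"; fi
  if matches "$f" '*.md'; then echo "*.md matches $f"; fi
done

for link in '#usage' 'docs/page.md#usage'; do
  if is_bare_anchor "$link"; then echo "ignored: $link"; fi
done
```

The first pattern requires at least one directory separator, so README.md at the repo root never matches it, while "*.md" matches files at any depth.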

2 participants