buildtest tutorial on perlmutter #1338

Merged
merged 9 commits into from
Jan 24, 2023

Conversation

shahzebsiddiqui
Member

@shahzebsiddiqui shahzebsiddiqui commented Jan 11, 2023

In preparation for the buildtest tutorial at ECPAM, this PR is a first draft of the buildtest tutorial on Perlmutter. The docs were rendered in the PR; see https://buildtest--1338.org.readthedocs.build/en/1338/buildtest_perlmutter.html or look at the CI checks below.

@jscook2345: Can you please help peer-review this PR in preparation for the tutorial?

@wspear: I would like to get your thoughts on this as well. I wasn't sure to what extent we can cover 'E4S Testsuite' and 'Spack Test' on Perlmutter. Note that there will be a hands-on tutorial covering the buildtest-spack integration in https://buildtest.readthedocs.io/en/devel/buildspecs/spack.html that is performed in the container.

@codecov

codecov bot commented Jan 11, 2023

Codecov Report

Base: 71.03% // Head: 71.03% // No change to project coverage 👍

Coverage data is based on head (98ccb59) compared to base (b928070).
Patch has no changes to coverable lines.

❗ Current head 98ccb59 differs from pull request most recent head 9e006cc. Consider uploading reports for the commit 9e006cc to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##            devel    #1338   +/-   ##
=======================================
  Coverage   71.03%   71.03%           
=======================================
  Files          57       57           
  Lines        6130     6130           
  Branches     1090     1090           
=======================================
  Hits         4354     4354           
  Misses       1774     1774           
  Partials        2        2           


@pull-request-size pull-request-size bot added size/L and removed size/M labels Jan 11, 2023
@shahzebsiddiqui shahzebsiddiqui self-assigned this Jan 13, 2023
@shahzebsiddiqui shahzebsiddiqui added the documentation documentation fix label Jan 23, 2023
@prathmesh4321
Collaborator

Hi @shahzebsiddiqui. I see a few changes to be made.

  1. The "Install buildtest" link mentioned here appears to be broken; it gives a 404 error. See https://buildtest--1338.org.readthedocs.build/en/1338/buildtest_perlmutter.html#:~:text=Next%2C%20you%20should%20Install%20buildtest%20by%20cloning%20the%20repository%20in%20your%20%24HOME%20directory.

  2. The path to clone here https://buildtest--1338.org.readthedocs.build/en/1338/buildtest_perlmutter.html#:~:text=HOME%0Agit%20clone-,https%3A//github.com/buildtesters/buildtest,-%24HOME/buildtest%2Dnersc should be for buildtest-nersc repo instead of the buildtest repo.

  3. While running the "buildtest build" command in Exercise 1, should the path be "$BUILDTEST_ROOT/perlmutter_tutorial/ex1/hostname.yml --pollinterval=10" instead of "perlmutter_tutorial/ex1/hostname.yml --pollinterval=10"? See https://buildtest--1338.org.readthedocs.build/en/1338/buildtest_perlmutter.html#:~:text=perlmutter_tutorial/ex1/hostname.yml%20%2D%2Dpollinterval%3D10

@shahzebsiddiqui
Member Author

Thanks for catching these mistakes; I made the corrections. Note that you can comment in-line: if you click the review button, you can comment directly on a line number. You may find this link https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/reviewing-proposed-changes-in-a-pull-request helpful for reviewing PRs on GitHub.

Collaborator

@prathmesh4321 prathmesh4321 left a comment

@shahzebsiddiqui shahzebsiddiqui merged commit 663fc85 into devel Jan 24, 2023
@shahzebsiddiqui shahzebsiddiqui deleted the buildtest_tutorial branch January 24, 2023 02:26
@shahzebsiddiqui shahzebsiddiqui linked an issue Jan 24, 2023 that may be closed by this pull request
@@ -49,7 +49,7 @@ is processed.
:scale: 75 %


-For every discovered buildspecs, buildtest will validate the buildspecs in the :ref:`parse stage <_parse_buildspecs>` to
+For every discovered buildspecs, buildtest will validate the buildspecs in the :ref:`parse stage <parse_buildspecs>` to
Collaborator

I believe these should be buildspec instead of buildspecs since you're talking about the singular here.

Buildtest Tutorial on Perlmutter
===================================

This tutorial will be conducted on `Perlmutter <https://docs.nersc.gov/systems/perlmutter/>`_ system. If you need account access please
Collaborator

Suggestion: 'conducted on the Perlmutter system.'

Setup
------

Once you have a NERSC account, you can `connect to NERSC system <https://docs.nersc.gov/connect/>`_. You will need access to a
Collaborator

Suggestion: 'you can connect to any NERSC system.'

------

Once you have a NERSC account, you can `connect to NERSC system <https://docs.nersc.gov/connect/>`_. You will need access to a
terminal client and ssh into perlmutter as follows::
Collaborator

Suggestion: 'You will need access to ssh. With that you can connect to perlmutter as follows'

Once you have a NERSC account, you can `connect to NERSC system <https://docs.nersc.gov/connect/>`_. You will need access to a
terminal client and ssh into perlmutter as follows::

ssh <user>@perlmutter-p1.nersc.gov
Collaborator

Maybe tell them about using MFA + password here?

Member Author

MFA is not required for training accounts.


module load python

Next, you should :ref:`Install buildtest <installing_buildtest>` by cloning the repository in your $HOME directory.
Collaborator

Suggestion: 'cloning the repository into your home directory'

Collaborator

Suggest giving them a git command to do this


Next, you should :ref:`Install buildtest <installing_buildtest>` by cloning the repository in your $HOME directory.

Once you have buildtest setup, please clone the following repository https://github.com/buildtesters/buildtest-nersc in your $HOME directory as follows::
Collaborator

Missing instructions to set up buildtest.

Collaborator

Suggestion: 'please clone the following repository into your home directory:'

Since the git command has the full repo you don't need to list it twice.

Member Author

Yeah, the setup requires them to read the Installing buildtest page, which includes creating a Python virtual environment and sourcing the setup script. Instead of re-documenting it, I just put a link to the page.

Exercise 1: Running a Batch Job
--------------------------------

In this exercise, we will submit a batch job that will run `hostname` in the slurm cluster. Shown below is the example buildspec
Collaborator

Suggestion: 'will run the hostname command'.

Collaborator

Suggestion: 'Here is an example buildspec'

.. literalinclude:: ../perlmutter_tutorial/ex1/hostname.yml
:language: yaml

Let's run this test and poll interval for 10 secs::
Collaborator

Suggestion: "Let's run this test with a poll interval of ten seconds"


buildtest build -b $BUILDTEST_ROOT/perlmutter_tutorial/ex1/hostname.yml --pollinterval=10

Once test is complete, check the output of test by running::
Collaborator

Suggestion: "Once the test is complete, you can check the output of the test by running"


buildtest inspect query -o hostname_perlmutter

Next, let's update the test such that it runs on both **regular** and **debug** queue. You will need to update the **executor** property and
Collaborator

Suggestion: 'runs on both the...'

specify a regular expression. Please refer to :ref:`Multiple Executors <multiple_executors>` for reference. You can retrieve a list of available executors
by running ``buildtest config executors``.

Once you have updated the test, please rerun the test, now you should expect to see two runs for same test.
Collaborator

Suggestion: "Once you have updated and re-run the test, you should now see two results"

And additionally maybe show an example result
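
On the executor change requested in this exercise, here is a minimal Python sketch of how a single regular expression can select both queues. The executor names below are hypothetical placeholders (the real list comes from ``buildtest config executors``), and the sketch assumes the pattern has to match the whole executor name; it illustrates the idea and is not buildtest's own code.

```python
import re

# Hypothetical executor names, for illustration only.
# Run `buildtest config executors` to see the real ones on your system.
executors = [
    "perlmutter.local.bash",
    "perlmutter.slurm.debug",
    "perlmutter.slurm.regular",
]

# A single pattern that selects both the regular and debug queues.
pattern = r"perlmutter\.slurm\.(regular|debug)"

# Assumption: the pattern must match the entire executor name.
matched = [name for name in executors if re.fullmatch(pattern, name)]
print(matched)
```

With two executors matched, the same test would be expected to run once per executor, which is why the exercise text mentions seeing two runs.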

Exercise 2: Performing Status Check
------------------------------------

In this exercise, we will check version of Lmod via environment **LMOD_VERSION** and specify the
Collaborator

@jscook2345 jscook2345 Jan 26, 2023

Suggestion: "... check the version of ..."

Suggestion: "... Lmod using the environment variable ..."

Suggestion: "... and specifying the output using a ..."

.. literalinclude:: ../perlmutter_tutorial/ex2/module_version.yml
:language: yaml

This buildspec is invalid, your first task is to make sure buildspec is valid. Once you have accomplished this task, try building
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the idea of mixing an invalid buildspec with a new type of test. I'd rather repeat with a broken spec, but that's up to you

Collaborator

Suggestion: "This buildspec is invalid. Your first task is to fix it."

:language: yaml

This buildspec is invalid, your first task is to make sure buildspec is valid. Once you have accomplished this task, try building
the test and check the output of test. If your test passes, try updating the regular expression and see if test fails. Revert the change
Collaborator

Suggestion: "...try building the test and verifying its output."
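
For the part of this exercise that asks attendees to change the regular expression and watch the test fail, a small Python sketch of the idea may help. The version string and the patterns are made up for illustration, ``status_from_output`` is a hypothetical helper, and the use of a search-style match is an assumption rather than a statement about buildtest internals.

```python
import re


def status_from_output(output: str, expression: str) -> str:
    """Return PASS if the expression is found in the output, else FAIL."""
    # Assumption: a search-style match anywhere in the output decides the state.
    return "PASS" if re.search(expression, output) else "FAIL"


# Hypothetical value of LMOD_VERSION as echoed by the test.
output = "8.7.14"

print(status_from_output(output, r"^8\.7\.\d+$"))  # pattern matching the value
print(status_from_output(output, r"^9\."))         # deliberately wrong pattern
```

Swapping the expression for one that cannot match is the quickest way to confirm that the status check is actually being applied.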


buildtest buildspec find --root $HOME/buildtest-nersc/buildspecs --rebuild -q

In this task you will be required to do the following
Collaborator

You give them instructions but you do not tell them how to do any of it. Is that intended?

Member Author

Yeah, it was intended for them to do this exercise by learning how to use the buildtest buildspec commands. The first part of the tutorial will cover the command line, which is being handled in #1353.

In this task you will be required to do the following

1. Find all tags
2. List all filter and format fields
Collaborator

filters


1. Find all tags
2. List all filter and format fields
3. Format table via fields ``name``, ``description``
Collaborator

tables

1. Find all tags
2. List all filter and format fields
3. Format table via fields ``name``, ``description``
4. Filter buildspec by tag ``e4s``
Collaborator

buildspecs

Exercise 4: Querying Test Reports
----------------------------------

In this exercise you will be learn how to :ref:`query test report <test_reports>`. This can be done by
Collaborator

reports

In this exercise you will be learn how to :ref:`query test report <test_reports>`. This can be done by
running ``buildtest report``. In this task please do the following

1. List all filter and format fields
Collaborator

filters

running ``buildtest report``. In this task please do the following

1. List all filter and format fields
2. Query all test by returncode 0
Collaborator

tests


1. List all filter and format fields
2. Query all test by returncode 0
3. Query all test by tag ``e4s``
Collaborator

tests

1. List all filter and format fields
2. Query all test by returncode 0
3. Query all test by tag ``e4s``
4. Print total count of failed tests
Collaborator

Print the total count of all failed tests

3. Query all test by tag ``e4s``
4. Print total count of failed tests

Let's upload the test to CDASH by running the following::
Collaborator

tests

Collaborator

Will the user be able to upload to CDash without a token of some sort?

Member Author

yep this should work.


buildtest cdash upload $USER-buildtest-tutorial

Take some time to analyze the output in CDASH by opening the link including PASS/FAIL test.
Collaborator

Remove 'including PASS/FAIL test'

Exercise 5: Specifying Performance Checks
--------------------------------------------

In this task, we will using :ref:`performance checks <perf_checks>` to determine state of test.
Collaborator

we will be using

Collaborator

To determine what? 'state of test' does not make sense to me in the context of performance tests. What should go here?


In this task, we will using :ref:`performance checks <perf_checks>` to determine state of test.
In this exercise, we will be running the STREAM benchmark. Shown below is an example buildspec that you
will be working with
Collaborator

Move to previous line

buildtest inspect query -o stream_test

Take a close look at the metrics value. In this task, you are requested to use use :ref:`assert_ge` with metric ``copy`` and
``scale`` with reference value. For reference value please experiment with different metrics and see if test pass/fail.
Collaborator

...with a reference value.

Collaborator

For the reference value...

Collaborator

see if the test passes or fails.
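
To make the intent of this exercise concrete, here is a rough Python sketch of a greater-than-or-equal check against reference values. The metric values and reference numbers are invented, ``check_assert_ge`` is a hypothetical helper, and requiring every comparison to hold is an assumption; the real behaviour is whatever the :ref:`assert_ge` documentation specifies.

```python
def check_assert_ge(metrics: dict, references: dict) -> bool:
    """Return True when every named metric is >= its reference value."""
    # Assumption: all comparisons must hold for the check to pass.
    return all(metrics[name] >= ref for name, ref in references.items())


# Hypothetical STREAM metrics captured from a test run (MB/s).
metrics = {"copy": 23000.0, "scale": 21000.0}

# Reference values low enough that both comparisons succeed.
print(check_assert_ge(metrics, {"copy": 20000.0, "scale": 20000.0}))

# Raising one reference above the measured value flips the result.
print(check_assert_ge(metrics, {"copy": 20000.0, "scale": 25000.0}))
```

Trying a reference just below and just above the measured value is a simple way for attendees to see the test pass and then fail.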

Labels
documentation documentation fix size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

buildtest tutorial on Perlmutter
3 participants