Skip to content
This repository has been archived by the owner on May 10, 2023. It is now read-only.

[flux-core v0.13] example2 update: Update README.md, Python scripts #40

Merged
merged 5 commits into from
Jan 21, 2020

Conversation

cmoussa1
Copy link
Member

This PR makes the following changes:

  1. README.md
  • Update format of instructions from bullet points to numbered steps
  • Changed code instructions from bold font to in-line code
  • Removed step to set FLUX_SCHED_OPTIONS environment variable
  1. submitter.py and submitter2.py
  • Updated scripts to utilize new Jobspec format, Python bindings (thanks a lot to @SteVwonder for all of his help to get these working! 🎉)
  1. Lua scripts
  • Since the Lua bindings have been deprecated, the Lua submission scripts, as well as compute and io-forwarding scripts have been removed.

The [WIP] tag is here because I'm not sure if we want to make changes to the new Python scripts so that they are Black compatible. If we want to do that, I'd be happy to update them, and then drop the [WIP] tag.

@grondo
Copy link
Contributor

grondo commented Dec 17, 2019

I wonder if we could try out using GitHub Actions to run black formatter on examples in this repo, rather than setting up Travis.

https://github.com/marketplace/actions/black-code-formatter

@cmoussa1
Copy link
Member Author

@grondo. I like that suggestion; I will look into setting that up. In the meantime, I blackened the Python files manually in my forked repo and pushed those changes! I think I'll drop the [WIP] tag too, then.

@cmoussa1 cmoussa1 changed the title [WIP][flux-core-v0.13] example2 update: Update README.md, Python scripts [flux-core-v0.13] example2 update: Update README.md, Python scripts Dec 17, 2019
Copy link
Contributor

@grondo grondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just commenting with a few ideas. I'm not sure if they are the right approach for these examples, or if we should save the more verbose descriptions for something more like a tutorial. So take or leave the comments, or discuss more here 😄


- **setenv FLUX_SCHED_OPTIONS "node-excl=true"** # Make sure the scheduler module will do node-exclusive scheduling
2. `srun --pty --mpi=none -N3 flux start -o,-S,log-filename=out`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be more instructive if a little more prose was included with these steps, i.e. make the style more tutorial-like.

e.g. instead of

  1. salloc -N3 -p pdebug
  2. srun --pty --mpi=none -N3 flux start -o,-S,log-filename=out

Something like

  1. Allocate 3 nodes from local resource manager:
$ salloc -N3 -p pdebug
salloc: granted job allocation 1234
  1. Launch a flux instance on the current allocation by running flux start once per node, redirecting log messages to the file out in the current directory
$ srun --pty --mpi=none --ntasks-per-node=1 flux start -o,-S,log-filename=out
$

And so on..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea! Since most of the beginning of the examples are identical (requesting an allocation from a resource manager, launching a Flux instance), what if we posted a note of the most common commands with their descriptions in a top-level README, along with the titles of each example?

We could have, for instance:

flux-workflow examples

<top level readme with links, descriptions of each example>
Some Notes About The Examples
Most of the examples contain the same commands. Descriptions for these commands are as follows:

  • salloc -N3 -p pdebug: allocates 3 nodes from a local resource manager.
  • srun --pty --mpi=none --ntasks-per-node=1 flux start -o,-S,log-filename=out: launch a Flux instance on the current allocation once per node, redirecting log messages to the file out in the current directory

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My opinion would be that the explanations should be kept in-line with the examples to be most instructive. However, since as you say that will be very repetitive, maybe what we really want to work on next is to take the most interesting and general cases from these workflow examples and combine them into a single page or click-through Tutorial.

That might save us some work here, and having everything in a single place would make it easier for interested users.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Maybe this could be a good starting point for that conversation about documentation with Elsa! 😄


- **srun --pty --mpi=none -N3 /usr/global/tools/flux/toss_3_x86_64_ib/default/bin/flux start -o,-S,log-filename=out**
3. `./submitter.py`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should consider pasting/including the source of these examples directly in the README text? Then readers could get the gist of the entire example on one page, without checking out the git repo, or using extra clicks to display the source via GitHub.

I'm not sure if you can include files in markdown, so that might be a point against this idea in practice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just from a quick search of the internet, it looks like we can't include files directly in markdown, but I think that suggestion of manually including some of the source code directly in the README could be really helpful, like a "highlights of the code" section or something. Maybe the more significant portions of the source code?

In example2, for instance, including the declaration and submission of compute_jobreq:

compute_jobreq = JobspecV1.from_command(
    command=["./compute.py", "120"], num_tasks=4, num_nodes=2, cores_per_task=2
)
compute_jobreq.cwd = os.getcwd()
compute_jobreq.environment = dict(os.environ)
print(flux.job.submit(f, compute_jobreq))

We could also include a short explanation of some of the code as well.


- **./submitter.lua** # or ./submitter.py
4. List jobs in KVS:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, flux job list isn't listing jobs in KVS. Maybe "List active jobs" instead?

```

- **flux kvs get lwj.0.0.1.R_lite**
5. Get value stored under job key:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, maybe more prose here, to explain what the value we're fetching from the KVS. Maybe with some background information, e.g.

  1. Information about jobs, such as the submitted job specification, an eventlog, and the resource description format R are stored in the KVS. The data can be queried via the job-info module via the flux job info command. For example, to fetch R for a job which has been allocated resources:
$ flux job info 316703506432 R
{"version":1,"execution":{"R_lite":[{"rank":"0-1","children":{"core":"0-3"}}]}}

@cmoussa1
Copy link
Member Author

Pushed some more changes thanks to the suggestions from @grondo. A couple changes were introduced in this last commit:

  • Added more verbosity to the instructions in this walkthrough. I pretty much used @grondo's exact wording for a majority of the instructions since they were both concise and descriptive; since this specific example essentially contains two sets of the same instructions, part (b) has an abbreviated description of each instruction.
  • Added a Notes section at the bottom of the README that tries to explain some of the more important parts of the source code used in this example.

@cmoussa1 cmoussa1 changed the title [flux-core-v0.13] example2 update: Update README.md, Python scripts [flux-core v0.13] example2 update: Update README.md, Python scripts Jan 7, 2020
@dongahn
Copy link
Member

dongahn commented Jan 13, 2020

@cmoussa1: is this ready?

@cmoussa1
Copy link
Member Author

This is also ready! 👍

@dongahn
Copy link
Member

dongahn commented Jan 13, 2020

Do you need to rebase this?

@cmoussa1 cmoussa1 force-pushed the example2-update-core-v0.13 branch from 0f64fa0 to 7d1ff0c Compare January 13, 2020 23:19
@cmoussa1
Copy link
Member Author

Oops, you're right - I forgot I had to rebase the two separate README update commits. I'm sorry about that. This should be good to go now.

@dongahn
Copy link
Member

dongahn commented Jan 13, 2020

I just merged one of your PR. Please rebase once again.

@cmoussa1
Copy link
Member Author

Just ran git rebase master and got this output:

Current branch example2-update-core-v0.13 is up to date.

Do I need to do something else instead?

@dongahn
Copy link
Member

dongahn commented Jan 13, 2020

Maybe your master already have the up-to-date state... I typically do:

git fetch upstream
git rebase upstream/master
git push -f origin <the branch where you are working on this PR>

@cmoussa1 cmoussa1 force-pushed the example2-update-core-v0.13 branch from 7d1ff0c to b025ce4 Compare January 13, 2020 23:35
@cmoussa1
Copy link
Member Author

Ah! I needed to add the remote upstream branch; then git rebase properly updated my branch with those changes. Thanks for clearing that up for me.

@cmoussa1 cmoussa1 force-pushed the example2-update-core-v0.13 branch from b025ce4 to 794308c Compare January 14, 2020 23:02
@cmoussa1
Copy link
Member Author

Rebased and force pushed to get up to date with master after PR #46.

@cmoussa1 cmoussa1 force-pushed the example2-update-core-v0.13 branch from 794308c to 45f21ca Compare January 15, 2020 20:40
@cmoussa1
Copy link
Member Author

Rebased and force pushed to get up to date with master after PR #47.

Update submitter.py with new Jobspec
format, remove RPC calls and get_environment()
method
Update new Jobspec format, remove
RPC calls and get_environment method
Remove Lua scripts since their bindings
have been deprecated
@cmoussa1 cmoussa1 force-pushed the example2-update-core-v0.13 branch from 45f21ca to fd4ec33 Compare January 16, 2020 15:53
@cmoussa1
Copy link
Member Author

Rebased and force pushed to get up to date with master after PR #49.

Copy link
Member

@dongahn dongahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just one minor enhancement request :-) I really like that you added the note section into the read me by the way @cmoussa1. Thanks.



- The following constructs a job request using the **JobspecV1** class with customizable parameters for how you want to utilize the resources allocated for your job:
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mark this with ```python so that the snippet will be color coded when rendered?

Update README with new command list,
job listing format

Change instructions from bullet points
to numbered steps with in-line code

Fix alignment of columns in job listing
table to look nicer

Update README with more verbose
description of instructions and Flux
commands. Introduce 'Notes' section
that will highlight snippets of relevant
code utilizing Flux and its API.
Blacken Python files
@cmoussa1 cmoussa1 force-pushed the example2-update-core-v0.13 branch from 6fd266c to a62c577 Compare January 16, 2020 18:00
@cmoussa1
Copy link
Member Author

Thanks! The Notes section was suggested by @grondo, and I'm really liking how the repo is turning out with the inclusion of additional documentation.

Made the changes to the Python code block in the Notes section, rebased, and force-pushed.

@dongahn dongahn merged commit 1b59062 into flux-framework:master Jan 21, 2020
@cmoussa1
Copy link
Member Author

Thanks!

@cmoussa1 cmoussa1 deleted the example2-update-core-v0.13 branch January 22, 2020 22:44
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants