Add GHA to copy completed notebooks #132

sjspielman · 2023-08-24T20:18:39Z

⚠️ Stacked on #129
Closes #111

This PR adds a GitHub Action that copies completed notebooks from a given tag of training-modules into this template repository (presumably after someone has copied the template!). All repos involved are public, so no special security handling is needed here, which is important since we might expect folks running external workshops to use this action.

I developed this workflow in one of my personal repositories so I could spam myself to my heart's content, hence only one commit here. I confirmed it worked for any of the three workshops, but we should still do a final.final round of testing here as well. I figure we may want to get any conceptual reviews implemented first though, so this action currently only runs manually. Once we're confident that this is how the GHA should actually look, I can add a PR target so it will run here.

I made a variety of choices along the way which we can discuss during review if there are strongly differing opinions. Here's what I did!

The workflow takes two inputs, and the first one is conveniently handled with a dropdown menu:
- Which training module is being taught (this dictates which files to copy over). This defaults to the first choice item, which is a plain intro R workshop.
  - Note that I did not include bulk rnaseq here, but I can! I figured we are unlikely to teach it again, but that said there's nothing preventing external people from running one and it's a quick change to add in.
- The training-modules repo tag of interest, which defaults to master.
The workflow then clones both repos: training-modules and this one. You have to be a little careful with paths here, but it seems to work ok! I opted for this approach to find the files to copy rather than maintaining something analogous to an exercise_list like we do for the exercise copying action.
- Note that this repo gets cloned into a directory called main, which seemed like a good generic name for the repo we are PRing into (it won't always be training-specific-template, since this action will generally be run in a differently-named repo created from this template).
- I decided this for two reasons: I'd rather keep this all in 1 repo, and also it seemed like a fun thing to try out.
Since there are two repos here, I had to use --global flags for all the git config settings.
- If we don't want global username and email, I could alternatively cd main and then set local username/email if that is preferred.
Then, the workflow will make some arrays of which files need to be copied, and then they get curled into completed-notebooks/
- Noting here that I originally looked for .nb.html files, but we have one .html (not a notebook) living in intro scRNA-seq, so I went more permissive to just .html.
Finally, file the PR!
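
The global-versus-local git config choice described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the workflow's actual step: the bot name and email are placeholders, and the function names are made up for illustration. It relies on `git -C <dir>`, which runs a git command as if it had been started in that directory, so the local option does not need a `cd`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity (an assumption; the workflow's real values are not shown here).
bot_name="Example Bot"
bot_email="bot@example.com"

# Option 1: global settings apply to every repo cloned during the job.
configure_git_global() {
  git config --global user.name "$bot_name"
  git config --global user.email "$bot_email"
}

# Option 2: local settings for one checkout, applied without changing directories.
configure_git_local() {
  local repo_dir="$1"
  git -C "$repo_dir" config user.name "$bot_name"
  git -C "$repo_dir" config user.email "$bot_email"
}
```

On a throwaway CI runner either option should behave the same for the PR step; the local form only matters if the two checkouts ever need different identities.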

One key question for reviewers: Would we rather see the bash code pulled out into its own script to live in scripts/? I was on the fence at the beginning, but now I'm trending towards "yes".

jashapiro

My main comment here is that if you already checked out both repos, why use curl? My suggested solution may not be quite what you want, but it should be along the right lines, and could simplify things.

I also had a naming suggestion because main is so overloaded.

Another simplifying thought is to just make an array of modules and loop through it with something like

for module in $modules; do
  cp training-modules/$module/[0-9]*.html website/completed-notebooks/$module/.
done

(I added some specificity back by starting with a number)

jashapiro · 2023-08-24T21:19:26Z

.github/workflows/copy-completed-notebooks.yml

+          # the url for this tag of the training-modules repo where files will be downloaded from
+          base_url=https://raw.githubusercontent.com/AlexsLemonade/training-modules/${{ github.event.inputs.training-modules-tag }}


Why are we using a URL here rather than just copying the files we already have checked out?

Why, indeed.

jashapiro · 2023-08-24T21:30:12Z

.github/workflows/copy-completed-notebooks.yml

+            # note that at least one notebook is only .html, NOT .nb.html, so we'll just use .html
+            [[ $path != *.html ]] && echo "'$path' is not an .html file." && exit 1


Just noting that you used *html for the search, not *.html; it would probably be better to be consistent.

Now using [0-9]*.html throughout for copying, and we no longer have searching, because the point is to use both repos if we have both repos!
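
To make the difference between the three patterns in this thread concrete, here is a small sketch that reports which of `*html`, `*.html`, and `[0-9]*.html` match a given file name. The function name and the sample file names in the usage note are hypothetical and are not taken from training-modules.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the patterns (space-separated) that match the given file name.
matching_patterns() {
  local name="$1"
  local matched=""
  # The right-hand side of == is unquoted so bash treats it as a glob pattern.
  if [[ $name == *html ]]; then matched+=" *html"; fi
  if [[ $name == *.html ]]; then matched+=" *.html"; fi
  if [[ $name == [0-9]*.html ]]; then matched+=" [0-9]*.html"; fi
  # Drop the leading space before printing.
  echo "${matched# }"
}
```

For example, `matching_patterns "01-intro.nb.html"` reports all three patterns, `matching_patterns "setup.html"` reports only `*html` and `*.html`, and a name such as `notesxhtml` matches `*html` alone, which is why the looser search pattern and the stricter check could disagree.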

sjspielman · 2023-08-25T13:02:24Z

My main comment here is that if you already checked out both repos, why use curl?

🤦‍♀️ is my answer I think...!

@5

- Call directory website, not main
- Bump PR action to @5, remove safe directory creation
- Stop curling, and start copying, because _of course_
- Add bulk RNAseq
- Fancier PR comment
- Use case statement to simplify the code while still only getting a given workshop's files

sjspielman · 2023-08-25T13:59:37Z

After my first round of excitedly using a different strategy from exercise notebooks, and then almost impressively forgetting I was taking that strategy halfway through writing 🤪, here is round 2!

Again, this was tested in a separate private repo of mine to save us all from spam, hence the single commit (which to be clear I just taught a workshop about how this was a bad practice, but here we are!).

I directly incorporated suggestions myself, including main -> website (thanks, good call), bumping the action to @5, etc. I also did simplify the code, but I decided in the end to use a case statement for this; I think this is much easier to read than the if statements. So, we still get notebooks for a given workshop only (including bulk).

Let me know what you think now, but again, no rush on re-reviewing this today if that's not good for your schedule!

jashapiro

Looks good, with a little suggestion below for future maintenance ease, I think.

Semi-related thought: As we are doing all this rejiggering, is it time to update the default branch to main for these repos?

jashapiro · 2023-08-25T14:11:57Z

.github/workflows/copy-completed-notebooks.yml

+
+      - name: Configure git
+        run: |
+          # Configure git for just the `website` repo local credentials


I had no problem with the global settings. It seems safer than changing directories.

jashapiro · 2023-08-25T14:22:33Z

.github/workflows/copy-completed-notebooks.yml

+          target_path=website/completed-notebooks/
+
+          # Copy notebooks over depending on which training module is being taught
+          case "${{ github.event.inputs.training-module }}" in


I had forgotten how ugly case statements in bash are, but I guess it kind of makes sense here.

But I did like my for loop idea to have only one cp statement.

case "${{ github.event.inputs.training-module }}" in
  "Introduction to R and Tidyverse")
    modules=("intro-to-R-tidyverse")
    ;;
  "Introduction to single-cell RNA-seq")
    modules=("intro-to-R-tidyverse" "scRNA-seq")
    ;;

and so on...

followed by (syntax correct this time)

for module in ${modules[@]}
do
  cp training-modules/${module}/[0-9]*.html ${target_path}
done

Ahh I see, I like this!
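
Putting the two suggested pieces together, a self-contained version might look roughly like the sketch below. It is not the merged workflow code: the function name, the argument order, the `mkdir -p`, and the catch-all `*)` branch are assumptions added for illustration, and only the two workshops spelled out in this thread are included because the rest were elided.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Copy numbered .html notebooks for one workshop into a target directory.
# Arguments (assumed for this sketch):
#   $1 workshop title, $2 training-modules checkout, $3 target directory
copy_completed_notebooks() {
  local workshop="$1" source_dir="$2" target_path="$3"
  local modules=()

  # The case block only decides which module directories are needed.
  case "$workshop" in
    "Introduction to R and Tidyverse")
      modules=("intro-to-R-tidyverse")
      ;;
    "Introduction to single-cell RNA-seq")
      modules=("intro-to-R-tidyverse" "scRNA-seq")
      ;;
    *)
      # Remaining workshops were elided in the thread, so this sketch rejects them.
      echo "Unknown workshop: '$workshop'" >&2
      return 1
      ;;
  esac

  mkdir -p "$target_path"
  local module
  for module in "${modules[@]}"; do
    # The glob stays outside the quotes so the shell still expands it.
    cp "$source_dir/$module"/[0-9]*.html "$target_path"
  done
}
```

A call such as `copy_completed_notebooks "Introduction to R and Tidyverse" training-modules website/completed-notebooks/` would then copy only that workshop's numbered notebooks. Quoting `"${modules[@]}"` keeps each array element intact, and keeping a single `cp` line means a future pattern change happens in one place.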

sjspielman · 2023-08-25T14:32:18Z

Semi-related thought: As we are doing all this rejiggering, is it time to update the default branch to main for these repos?

I had thought about this at one point, but got nervous about finding all the links in all the places since it's spread across several repos. As long as we don't delete master (which I don't think you're suggesting anyways!), then previously-used links should work just fine I suppose. I might call this a "nice to have but let's circle back towards the end of this epic and see what the mess looks like."

jashapiro · 2023-08-25T14:34:18Z

Semi-related thought: As we are doing all this rejiggering, is it time to update the default branch to main for these repos?

I had thought about this at one point, but got nervous about finding all the links in all the places since it's spread across several repos. As long as we don't delete master (which I don't think you're suggesting anyways!), then previously-used links should work just fine I suppose. I might call this a "nice to have but let's circle back towards the end of this epic and see what the mess looks like."

Yeah, I guess I was thinking that this repo might be the one to start with? Nothing should link back to it that I am aware of...

sjspielman · 2023-09-07T13:21:36Z

@jashapiro was there anything else here? I think this was pretty much set?

jashapiro

Just a couple more little changes (which I caught better because of the simplified structure, so that's a win).

jashapiro

⚡

Add tested workflow for copying completed notebooks

0cde52f

sjspielman requested a review from jashapiro August 24, 2023 20:18

jashapiro reviewed Aug 24, 2023

jashapiro reviewed Aug 25, 2023

sjspielman requested a review from jashapiro August 25, 2023 14:13

jashapiro reviewed Aug 25, 2023

sjspielman mentioned this pull request Aug 25, 2023

Consider changing default branch to main AlexsLemonade/training-modules#750

Open

sjspielman added 2 commits August 25, 2023 14:08

global config

089cf88

use case block to set up module paths, then loop over to copy in one …

edb4008

…cp line. Tested locally as script and syntax is good.

sjspielman requested a review from jashapiro September 7, 2023 13:21

jashapiro reviewed Sep 7, 2023

Apply suggestions from code review

d3de31f

Co-authored-by: Joshua Shapiro <[email protected]>

sjspielman requested a review from jashapiro September 7, 2023 13:32

jashapiro approved these changes Sep 7, 2023

sjspielman merged commit 78a48a8 into sjspielman/122-workshop-resources Sep 12, 2023

sjspielman deleted the sjspielman/111-copy-completed-notebooks-gha branch September 12, 2023 15:00

sjspielman mentioned this pull request Sep 12, 2023

Add GHA to copy completed notebooks #111

Closed
