Skip to content

Code Cleanup Procedure

Shintaro Iwasaki edited this page Jul 7, 2020 · 6 revisions

This is a guide for fixing your branches after the large code cleanup patch that was merged into MPICH.

Warnings

This is going to tell you to use the most dangerous command in all of Git: git filter-branch. This command can wreck your Git metadata if you do it wrong, so I'd highly recommend that you make a copy of your local Git repository and work there, instead of your usual working directory.

You will also want to closely verify the results of this work to make sure they were successful.

TL;DR

You'll need to go through this process for each of your branches, so here's the short version of the commands. Details after this section. Also, don't forget to set up the hooks to avoid problems in the future at the bottom of the page.

git rebase -i 596c4f47efea
  • Rebase onto 596c4f47efea
touch DELETE_ME && git add DELETE_ME && git commit -m 'DELETE ME'
git rebase -i 596c4f47efea
  • Move your dummy commit to the beginning of your branch
git filter-branch -f --tree-filter "maint/code-cleanup.sh --all --recursive" -- 596c4f47efea..HEAD
git rebase -i mpich/main
  • Remove the dummy commit from before
  • Resolve conflicts

Outline

This procedure is going to happen in a two basic steps (with lots of smaller steps):

  1. Fix all of the whitespace in your existing branch
  2. Rebase your branch onto the latest main branch.

Prerequisites

This entire procedure is a lot easier if you rebase your branch to the commit before the whitespace cleanup (596c4f47efea). This is because there are a few commits just before that patch that help the code cleanup script generate good output. You will need to resolve any conflicts to get to this point on your own.

Fix Whitespace in Your Current Branch

After you've rebased your branch onto the commit before the whitespace cleanup (596c4f47efea), you are ready to cleanup your existing branch.

  1. Create a dummy commit to "absorb" the whitespace changes.

touch DELETE_ME && git add DELETE_ME && git commit -m 'DELETE ME'

This commit will "absorb" all of the whitespace changes and be removed at the end of the process. For now, we're going to stick it at the beginning of your branch.

  1. Move the dummy commit to the beginning of your branch.

git rebase -i 596c4f47efea

If you're not familiar with interactive rebasing, I'm not going to cover it all here. For now, what you need to know is that you need to move the last commit to be the first commit. So remove the last line in the list of patches you see in your editor and move it to the beginning of the list of patches. So you should now see something like this:

pick a1b2c3d4 DELETE ME
pick e5f6g7h8 Your next patch
...more patches here...

# Rebase 12345678..23456798 onto 596c4f47efea
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
# x, exec = run command (the rest of the line) using shell
# d, drop = remove commit
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out

Save and close the file and let Git do it's work. Now if you look at the log (or git graph if you have that alias set up), you should see DELETE ME just before all of your patches, which are sitting on top of 596c4f47efea mpi: Add brackets to help indent script.

  1. Run the command to clean up the whitespace

git filter-branch -f --tree-filter "maint/code-cleanup.sh --all --recursive" -- 596c4f47efea..HEAD

This is the big command that's going to take a while. Go get lunch/dinner/drinks/watch a movie. This will take about 3-4 minutes per commit in your branch. This will run the code cleanup script on every patch in your branch without resolving merge conflicts. This is important because otherwise every patch would conflict with the one before it.

There will be a lot of output for each patch that looks like this:

Rewrite 89a8485e559013977dba3f7a4379391acb407274 (3/3) (226 seconds passed, remaining 0 predicted)    indent: src/mpi/debugger/tvtest.c:34: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_gpfs/ad_gpfs_rdcoll.c:519: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_gpfs/ad_gpfs_rdcoll.c:519: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_gpfs/ad_gpfs_wrcoll.c:651: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_gpfs/ad_gpfs_wrcoll.c:651: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_lustre/ad_lustre_hints.c:22: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_lustre/ad_lustre_open.c:24: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_lustre/ad_lustre_rwcontig.c:69: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_lustre/ad_lustre_rwcontig.c:143: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_nfs/ad_nfs_iwrite.c:59: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_nfs/ad_nfs_read.c:18: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_nfs/ad_nfs_write.c:18: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_nfs/ad_nfs_write.c:100: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_xfs/ad_xfs_read.c:23: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/ad_xfs/ad_xfs_write.c:24: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/common/ad_iwrite.c:88: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/common/ad_read_coll.c:515: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/common/ad_read_coll.c:515: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/common/ad_write_coll.c:315: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/common/ad_write_coll.c:315: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/common/ad_write_nolock.c:31: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/common/ad_write_nolock.c:32: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/common/system_hints.c:63: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/adio/common/system_hints.c:143: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/mpi/romio/test/big_extents.c:49: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/pm/util/cmnargs.c:107: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/pm/util/cmnargs.c:114: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

indent: src/pm/util/process.h:61: Warning:old style assignment ambiguity in "=-".  Assuming "= -"

When you're done, your branch will be converted to have all of the formatting compliant with the code cleanup script. You will also have an old reference sitting around that git filter-branch keeps as a backup called refs/original/refs/heads/<your_branch_here>. Ignore that one for now (you can clean it up later) and let's focus on the new one. At this point, you can check your branch to make sure everything looks like you think it should.

Time to move to phase two...

Rebase your branch onto the latest main branch

Now that the whitespace is cleaned up in your branch, the number of conflicts your branch will have when rebased back onto main will be much less. I say less and not none because you will still see two types of conflicts. 1) Normal conflicts for anything that has changed on main since the initial code cleanup patch, and 2) Whitespace conflicts with manual fixes done in the code cleanup patch. The code cleanup patch was more than just running the code cleanup script. It also contained some manual whitespace fixes that the code cleanup script missed and would only need to be done once. If you conflict with these changes (mostly inside macros, but also some other places), you may still have to fix those manually. Unfortunately there isn't a shortcut for that, but the number of places that comes up is much lower.

  1. Interactive rebase onto main and drop the dummy patch

git rebase -i mpich/main

For this command, I assume that your local name for https://github.com/pmodels/mpich.git is mpich. If that's not what you call it, change mpich to your name (for me, it's actually argonne). When you do this, you'll want to drop the dummy patch we created earlier. To do that, just delete that line from the editor that pops up when you run the command so your editor should only show patches that were originally in your branch.

  1. Resolve conflicts

It's possible that you'll run into conflicts during this rebase. Fix them as they come up. If you don't know how to fix rebase conflicts, use git status for hints and ask a friend/the Internet.

  1. Celebrate

At this point, your branch should look correct. Your branch should not contain a bunch of formatting fixes, but should look pretty much like they did before. Make sure all of that is true before pushing your branch to any remote and overwriting previous work.

Keep From Screwing Up the Future

There are two things you can to do make sure things don't end up looking bad again in the future.

Add a pre-commit hook to your local repository

Pre commit hooks are scripts that get run whenever you create a commit. This hook will make sure you don't add bad formatting or whitespace and if you do, it will tell you about it so you can fix it. It will not fix it for you.

Use the hook in the MPICH tree: https://github.com/pmodels/mpich/blob/main/maint/hooks/pre-commit

Save the file into your MPICH source directory under .git/hooks/pre-commit. From now on, whenever you create a commit, it will be run. If, for some reason, you really need to ignore it, you can do so by adding -n to your git commit command.

Add a hook to your Jenkins/GitHub CI testing

If you're running a Jenkins server, this is the hook you'll need to make check for bad formatting. I'll assume you know how to do that: