Regression in error handler since version 5.0.1 #12201

Closed
prj- opened this issue Dec 31, 2023 · 12 comments

prj- commented Dec 31, 2023

When switching from version 5.0.0 to version 5.0.1, we noticed a regression in the error handler: it now prints some additional (unwanted) characters to the terminal. See the discussion here: https://gitlab.com/petsc/petsc/-/merge_requests/7143#note_1709258522.

prj- changed the title from "Regression in error handler since" to "Regression in error handler since version 5.0.1" Dec 31, 2023
Author

prj- commented Dec 31, 2023

@jsquyres, here is the bisection log.

git bisect start
# bad: [79f675e98ca842c7feb7615b292a202e3c0646ee] Merge pull request #12177 from wenduwan/5.0.1_release
git bisect bad 79f675e98ca842c7feb7615b292a202e3c0646ee
# good: [d0fe8ef8113906ac485c8d4a2e687d6bdcaa8305] update LICENCE file
git bisect good d0fe8ef8113906ac485c8d4a2e687d6bdcaa8305
# good: [d55bb67bd39423f0e4940915b1e906da2279e225] Merge pull request #12069 from jsquyres/pr/v5.0.x/still-moar-docs-updates
git bisect good d55bb67bd39423f0e4940915b1e906da2279e225
# good: [b34742d0131e4ed21c536e147f4efa65ced6deb4] osc_rdma: do pointer match on (const char*)
git bisect good b34742d0131e4ed21c536e147f4efa65ced6deb4
# good: [dcbea8518f78595eded77d8b29eda067e83feec7] Merge pull request #12139 from wenduwan/backport_ompi_info
git bisect good dcbea8518f78595eded77d8b29eda067e83feec7
# bad: [ca56ceb9365bd1f4beabcbf415e7e63b6dee365e] osc/rdma: fix compiler warnings in osc_rdma_accumulate.c
git bisect bad ca56ceb9365bd1f4beabcbf415e7e63b6dee365e
# bad: [15bfea8adefdb4deb3d1f328c9807b128720e27b] Merge pull request #12134 from jsquyres/pr/v5.0.x/update-prrte-and-corresponding-docs
git bisect bad 15bfea8adefdb4deb3d1f328c9807b128720e27b
# bad: [f53e5908c26cc876fc4a8e0e719c960505c15ccd] Update prrte submodule pointer to head of v3.0 branch
git bisect bad f53e5908c26cc876fc4a8e0e719c960505c15ccd
# good: [20e943685f40337c08a005dae289981338bdb5e8] docs: update to match PRRTE v3.0.3 docs
git bisect good 20e943685f40337c08a005dae289981338bdb5e8
# first bad commit: [f53e5908c26cc876fc4a8e0e719c960505c15ccd] Update prrte submodule pointer to head of v3.0 branch
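
A log like the one above can be fed back to `git bisect replay`, and the per-commit good/bad decision can be automated with `git bisect run`, which treats exit status 0 as good, 125 as skip, and any other status from 1 to 127 as bad. The following is a minimal sketch of such a check, assuming the regression is detected by comparing a reference log against a freshly produced one. The function name `bisect_verdict` and its file arguments are hypothetical and are not taken from the script referenced in this thread.

```shell
#!/bin/sh
# Hypothetical helper for `git bisect run`: maps a log comparison to the
# exit statuses that git bisect understands.
#   0   -> good (logs are identical)
#   1   -> bad  (logs differ)
#   125 -> skip (a log is missing, e.g. the build failed at this commit)
bisect_verdict() {
    expected=$1
    actual=$2
    # Without both logs this commit cannot be judged; ask bisect to skip it.
    if [ ! -r "$expected" ] || [ ! -r "$actual" ]; then
        return 125
    fi
    # cmp -s is silent and compares byte by byte, so even a single stray
    # character makes the commit count as bad.
    if cmp -s "$expected" "$actual"; then
        return 0
    fi
    return 1
}
```

A wrapper script that builds the commit, produces both logs, and ends by exiting with the status of `bisect_verdict` could then be passed to `git bisect run`.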

jsquyres added this to the v5.0.2 milestone Jan 8, 2024
Member

jsquyres commented Jan 8, 2024

We talked about this on the RM call today -- we probably need an easy way to replicate this.

@prj- can you provide a simple recipe for replication?

Author

prj- commented Jan 8, 2024

There is one here: https://gitlab.com/petsc/petsc/-/merge_requests/7143#note_1709308747. Further down that thread is the script you'll need to replay the bisection. To just get the example running, the following should be enough.

$ git clone https://gitlab.com/petsc/petsc
$ cd petsc
$ export HASH=f53e5908c26cc876fc4a8e0e719c960505c15ccd PETSC_DIR="$(pwd)" PETSC_ARCH=arch-debug-openmpi
$ ./configure --with-debugging=1 --download-openmpi=git://https://github.com/open-mpi/ompi --download-openmpi-commit="${HASH}" --with-fortran-bindings-inplace
$ make all PETSC_ARCH="${PETSC_ARCH}" PETSC_DIR="${PETSC_DIR}"
$ cd src/sys/tests && make ex1f PETSC_ARCH="${PETSC_ARCH}" PETSC_DIR="${PETSC_DIR}"
$ ./ex1f >& log
$ "${PETSC_DIR}/${PETSC_ARCH}/bin/mpiexec" -n 1 ./ex1f >& log_bis
$ diff log log_bis
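
Because the unwanted characters may be non-printing, the output of a plain `diff` can be hard to read. The following is a minimal sketch, assuming the `log` and `log_bis` files produced by the recipe above, that renders each file with `od -c` before diffing so that stray bytes show up as visible escapes or octal codes. The function name `show_byte_diff` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: diff two files after dumping them with `od -c`, so
# non-printing bytes are shown explicitly instead of being sent raw to the
# terminal. Returns diff's status: 0 if identical, 1 if different.
show_byte_diff() {
    a_dump=$(mktemp) || return 2
    b_dump=$(mktemp) || { rm -f "$a_dump"; return 2; }
    od -c "$1" > "$a_dump"
    od -c "$2" > "$b_dump"
    status=0
    diff "$a_dump" "$b_dump" || status=$?
    # Clean up the temporary dumps before reporting the result.
    rm -f "$a_dump" "$b_dump"
    return "$status"
}
```

It would be called as `show_byte_diff log log_bis` from the directory where the two logs were written.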

Contributor

rhc54 commented Jan 9, 2024

I'm afraid using something that obtuse doesn't really help much - I need something a little more direct. Fortunately, I already have a test that just calls "abort" and can use that to reproduce the error message, which shows the extra character.

It'll take me a bit to chase it down.

Member

jsquyres commented Jan 9, 2024

@janjust volunteered to take a crack at replicating this issue.

Contributor

rhc54 commented Jan 9, 2024

> @janjust volunteered to take a crack at replicating this issue.

As noted above, I already have.

Contributor

rhc54 commented Jan 10, 2024

Took a little bit to chase this down, but eventually figured it out. Fix is in referenced PR for PRRTE.

@jsquyres
Member

Nice find -- thanks @rhc54!

Contributor

janjust commented Jan 18, 2024

Fixed with the pointer update in #12237.

janjust closed this as completed Jan 18, 2024
Author

prj- commented Feb 7, 2024

Our pipeline was clean with 5.0.2rc1, but is again failing with 5.0.2, sadly.

Author

prj- commented Feb 7, 2024

I guess the milestone should be changed from 5.0.2 to something else, 5.0.3 or 5.1.0?

Contributor

janjust commented Feb 7, 2024

@prj- Yes, we expected that. We had a quick release schedule to fix a critical bug for an organization, so we rolled back the internal pointers. After today we will bump them again, and your issue should be fixed.

jsquyres modified the milestones: v5.0.2, v5.0.3 Feb 7, 2024