MPI_Pack_external_size is returning the wrong values in 64-bit applications #10
Comments
Imported from trac issue 628. Created by rolfv on 2006-11-22T11:16:34, last modified: 2010-02-19T09:49:18
Trac comment by rolfv on 2008-06-25 09:01:18: I just retested this with the latest trunk and this still fails for us.
Trac comment by rolfv on 2008-09-09 14:39:08: I just retested again with the trunk and with 1.3, and it still fails. It passes for 32-bit builds. Here are the 1.3 results (r19400).
Trac comment by bosilca on 2008-09-17 10:05:17: I did find some time to look at this one. And it's really funny ... I guess that your build of Open MPI does not support HETEROGENEOUS systems. Unfortunately, there are a lot of shortcuts when we know we are in a homogeneous environment, and one of those completely ignores the extern32 type (which differs from the native type on 64-bit machines) when we create communicators ... I'll try to figure out a nice solution, as even in heterogeneous systems the pack_extern32 should work as expected.
Trac comment by jsquyres on 2010-01-25 20:08:18: Sun -- is this still happening (i.e., even after the changes in the DDT engine within the last few months)?
Trac comment by rolfv on 2010-01-26 16:59:55: Yes, this still happens. I tried it on Solaris SPARC and a Linux machine using the Sun Studio compilers in both cases. Here is the Solaris SPARC failure.
Trac comment by jsquyres on 2010-01-26 17:18:07: George -- any hope of this getting fixed?
Trac comment by bosilca on 2010-02-08 15:43:17: The root of the problem here is that if heterogeneous support is not enabled at build time, Open MPI does not support anything other than homogeneous environments. In this particular instance (i.e. a 64-bit build and the external representation), we are supposed to do a conversion from the local type into the external one. As heterogeneous support is disabled, we report the same size ...
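Editorial illustration of the size mismatch described above (not part of the original thread). This is a minimal sketch that does not call MPI at all: it compares the fixed "external32" element sizes listed in the MPI standard with the native C sizes on the machine running the script. The helper names (`external32_pack_size`, `native_pack_size`, `mismatched_types`) are hypothetical, and `native_pack_size` is only `sizeof * count`, which is an assumption made for the comparison and not necessarily what `MPI_Pack_size` returns.

```python
import ctypes

# Element sizes in bytes for the "external32" data representation,
# taken from the MPI standard. These are fixed and do not depend on
# the platform the program runs on.
EXTERNAL32_SIZE = {
    "MPI_CHAR": 1,
    "MPI_SHORT": 2,
    "MPI_INT": 4,
    "MPI_LONG": 4,
    "MPI_LONG_LONG": 8,
    "MPI_FLOAT": 4,
    "MPI_DOUBLE": 8,
    "MPI_LONG_DOUBLE": 16,
}

# Native C types used here as stand-ins for the local representation.
NATIVE_CTYPE = {
    "MPI_CHAR": ctypes.c_char,
    "MPI_SHORT": ctypes.c_short,
    "MPI_INT": ctypes.c_int,
    "MPI_LONG": ctypes.c_long,
    "MPI_LONG_LONG": ctypes.c_longlong,
    "MPI_FLOAT": ctypes.c_float,
    "MPI_DOUBLE": ctypes.c_double,
    "MPI_LONG_DOUBLE": ctypes.c_longdouble,
}


def external32_pack_size(datatype, count):
    """Bytes needed for `count` elements in external32 form."""
    return EXTERNAL32_SIZE[datatype] * count


def native_pack_size(datatype, count):
    """Bytes for `count` elements in the local form (plain sizeof * count)."""
    return ctypes.sizeof(NATIVE_CTYPE[datatype]) * count


def mismatched_types(count=1):
    """Datatypes whose local size differs from the external32 size."""
    return [
        name
        for name in EXTERNAL32_SIZE
        if external32_pack_size(name, count) != native_pack_size(name, count)
    ]


if __name__ == "__main__":
    for name in EXTERNAL32_SIZE:
        print(name, native_pack_size(name, 1), external32_pack_size(name, 1))
    print("types needing a size conversion here:", mismatched_types())
```

On a platform where `long` is 8 bytes, `MPI_LONG` would show up in the mismatch list, which is the kind of case where reporting the local size in place of the external32 size gives a wrong answer. Whether that is the exact datatype used in the failing test is not stated in the thread.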
Trac comment by jsquyres on 2010-02-08 15:59:54: Ah, fair enough. Should we just raise an MPI exception in this case (MPI_ERR_UNSUPPORTED or somesuch)?
Trac comment by rolfv on 2010-02-08 16:16:38: I am not sure I follow all this. We configure our build with heterogeneous enabled.
I also tried with the MCA flag to force heterogeneous.
Trac comment by rusraink on 2010-02-19 09:49:18: After inspecting and going through |
few years old and no replicator. |
Weeks <[email protected]> Date: Tue Dec 12 19:18:41 2017 -0800 Correct type of MPI_Comm_failure_get_acked failedgrp argument in Fortran USE mpi interface commit 199f5f0d2d6139460d0461cbf4b374d117dac4f6 Author: Aurelien Bouteiller <[email protected]> Date: Mon Nov 20 15:19:48 2017 -0500 Make sure we mark the proc as WAITPID status in signalled and non-zero exit cases commit e3006cafe4f9e4e55774679199b94b1e3d24ca5d Author: George Bosilca <[email protected]> Date: Fri Nov 3 23:47:16 2017 +0000 No accents in the names commit 2e75c73cc620eceb7396e9aac77a13e235c2a77b Author: Aurelien Bouteiller <[email protected]> Date: Fri Nov 3 18:59:52 2017 -0400 Tweak default FD and update readme notes commit f4bd88c98f1936a609e9145cd506b22a5722fa90 Author: Aurelien Bouteiller <[email protected]> Date: Fri Nov 3 18:59:22 2017 -0400 Pass correct arguments to pmix cb when out of memory commit 87d50db1d34695a97de094977f7fa9163c35b14e Author: Aurelien Bouteiller <[email protected]> Date: Thu Nov 2 10:48:10 2017 -0400 Changing the default IB retry timeouts is not a good idea. We'll need to find another way to speedup credit recovery in failure cases. 
commit 2fb5440a589baf8666f6cf30992b3a3bd04a6aca Author: Aurelien Bouteiller <[email protected]> Date: Wed Nov 1 10:07:39 2017 -0400 Mark the IB endpoint as failed when invoking an error; this resolves UDCM connection deadlocks commit 79aca0bb799f90f53c949e161b9f173c1fca2996 Author: Aurelien Bouteiller <[email protected]> Date: Tue Oct 31 23:20:31 2017 -0400 Make it compile in non-debug builds commit 04f61d22769f13adcfec822f83bc5ec079501a62 Author: Aurelien Bouteiller <[email protected]> Date: Tue Oct 31 22:55:51 2017 -0400 bugfix: major: openib send credits returned correctly after a fault for pending frags to dead processes; also tweak the default IB retry timeouts tomake this happen faster commit 942b0ab8bd8fc5f9e0b39312553c3a42228720c4 Author: Aurelien Bouteiller <[email protected]> Date: Tue Oct 31 22:02:24 2017 -0400 Bugfix: leaking frags after failure in TCP btl commit 6db29438a0299f779b59972ae6528a035ff56348 Author: Aurelien Bouteiller <[email protected]> Date: Mon Oct 30 21:34:04 2017 -0400 Copyrights since 1624f1f5 commit 5dd7d6fc35e1398e12338ecc49eadf30aa818a8d Author: Aurelien Bouteiller <[email protected]> Date: Mon Oct 30 21:21:37 2017 -0400 bugfix: returning ERR_PROC_FAILED from iSend violates ULFM spec. 
commit 9bf3923d51dcf876f1c20a01757cd94dbde9022a Author: Aurelien Bouteiller <[email protected]> Date: Mon Oct 30 17:01:33 2017 -0400 Bugfix to upstream: do not return ERR_IN_STATUS from collectives commit 954cd2f53e9c2985a21bbb1fc374b83678df8f8c Author: Aurelien Bouteiller <[email protected]> Date: Mon Oct 30 16:28:38 2017 -0400 Bugfix: capture cases where ERR_UNREACH is returned instead of PROC_FAILED when the BTL finds the failure first commit 61c5954fc1aff273a40c213d38e850862e9bf7e7 Author: Aurelien Bouteiller <[email protected]> Date: Fri Oct 27 17:30:23 2017 -0400 Fix error cases in TCP connect_ack commit 0237a70791b7b9d6f8b657e1a647b3b0dfab935f Author: Aurelien Bouteiller <[email protected]> Date: Fri Oct 27 17:27:48 2017 -0400 Various fixes to orte/pmix so that late notifications do not crash during finalize commit afe72afab6f873a66c9f257ac8d1e36f32627882 Author: Aurelien Bouteiller <[email protected]> Date: Fri Oct 27 17:24:29 2017 -0400 Turn of ftmpi_enabled after the FD is turned off. 
commit 9712330b37fb8d5b7f1f77e79efe0a0f6c695ade Author: Aurelien Bouteiller <[email protected]> Date: Fri Oct 27 17:07:55 2017 -0400 Fallback to abort when pml finds an error and ftmpi_enable is false commit a9ec68580d3fddd436b5df3b31e0621ba5d11f77 Author: Aurelien Bouteiller <[email protected]> Date: Wed Oct 25 16:46:42 2017 -0400 Bugfix: interrupt operations on localcomm in failed/revoked intercomms commit 8bacc1491355d4369251d45fa2e9db0e7647d05e Author: Aurelien Bouteiller <[email protected]> Date: Thu Oct 19 12:58:11 2017 -0400 Adjust init slack Signed-off-by: Aurelien Bouteiller <[email protected]> commit 1624f1f521bcd24978370ce614889fb01841ea8c Merge: 768e6f5c 689f1be9 Author: Aurelien Bouteiller <[email protected]> Date: Thu Oct 19 12:23:44 2017 -0400 Merge branch 'master' into ulfm commit 768e6f5c563bc4575fc3dd50313d0136958dd863 Author: Aurelien Bouteiller <[email protected]> Date: Thu Oct 19 12:19:28 2017 -0400 Resolve a case where the detector creates an event with infinite period commit 252544f8e4493ac5c2478f6d5322757168a67869 Merge: e3fff257 27eb401a Author: Aurelien Bouteiller <[email protected]> Date: Mon Oct 16 15:54:06 2017 -0400 Merge branch 'master' into ulfm commit e3fff257517996f5758cedb7c6f6082f9e18a6da Author: Aurelien Bouteiller <[email protected]> Date: Mon Oct 16 13:37:59 2017 -0400 Disable XPmem as it doesn't work with recovery commit d105a9f951a27bde804f3b9398e1e97acf894763 Author: George Bosilca <[email protected]> Date: Wed Oct 4 19:41:09 2017 -0400 Pass OMPI CFLAGS to libevent. 
Signed-off-by: George Bosilca <[email protected]> commit 250892aaa815e4f5b2e9692dd51f81fc4f47b733 Author: Aurelien Bouteiller <[email protected]> Date: Tue Oct 3 19:55:13 2017 -0400 Bugfix: permit detection of multiple failures on the same node commit 914fcbda90ac1b00d47dea7808e7cdfb48e73bba Author: Aurelien Bouteiller <[email protected]> Date: Tue Oct 3 11:16:36 2017 -0400 File had been added by mistake commit 9540a2c7ccb901119bebbc0be6edc9b0e6b86c76 Merge: 16221bf5 a3ac67be Author: Aurelien Bouteiller <[email protected]> Date: Tue Oct 3 10:16:20 2017 -0400 Merge branch 'master' into ulfm commit 16221bf5d7c312532230b2fabb891791327c5118 Author: Aurelien Bouteiller <[email protected]> Date: Mon Oct 2 13:33:30 2017 -0400 Bugfix: cleanup half created comms when failures strike in comm_dup and friends commit d04eb935478fa3afc1975aa7de0119d398e9772d Author: Aurelien Bouteiller <[email protected]> Date: Fri Sep 29 00:20:58 2017 -0400 Silence too verbose messages in libnbc commit 3ab5df55dbd087423acb7c87ba34ada99a6752b6 Author: Aurelien Bouteiller <[email protected]> Date: Thu Sep 28 23:40:12 2017 -0400 Interrupt the getnextcid_nb when a failure disrupts it. commit 2609388abeaadcaf6095130499c60bfc46ba4a00 Author: Aurelien Bouteiller <[email protected]> Date: Thu Sep 28 23:39:21 2017 -0400 Propagate error codes from NBC to upper layers. commit f679439e032eb3f03dc9afcdd62c2eae686bdb46 Author: Aurelien Bouteiller <[email protected]> Date: Wed Sep 27 17:52:30 2017 -0400 Start from known failures rather than acked failures in comm_free agree Signed-off-by: Aurelien Bouteiller <[email protected]> commit b63b7c15a1139395ff56f3fb448efea56dc7de91 Author: George Bosilca <[email protected]> Date: Wed Sep 27 01:08:05 2017 -0400 Use the correct header. 
Signed-off-by: George Bosilca <[email protected]> commit b4535b770b197e6c278340ffbff5891401e294c0 Merge: d888d603 7cb22e1b Author: Aurelien Bouteiller <[email protected]> Date: Mon Sep 25 23:18:14 2017 -0400 Merged perf/shrink_remembers into ulfm commit 7cb22e1b6bec7b3fd71aeff0bc7d737a5838dabe Author: Aurelien Bouteiller <[email protected]> Date: Mon Sep 25 23:07:46 2017 -0400 Perf: start shrink from known failures commit fecf5707a2882701f9435b25a487e1cb1aa8be9b Author: Aurelien Bouteiller <[email protected]> Date: Mon Sep 25 23:07:02 2017 -0400 Bugfix: revoke should not revoke NBCs pertaining to shrink commit d888d6035f5b9e41ef39b76a1709522b3652f890 Author: Aurelien Bouteiller <[email protected]> Date: Fri Sep 22 17:50:42 2017 -0400 Perf: decrease fd_finalize duration Signed-off-by: Aurelien Bouteiller <[email protected]> commit b064faf15c6349ffd5e4bf51b72960a77a7cfbf7 Author: Aurelien Bouteiller <[email protected]> Date: Fri Sep 22 11:56:19 2017 -0400 Bugfix: deadlock in finalize may happen if the fault detector is turned off while the last ERA is ongoing commit 024b90109cec452a249b0e2abee8b1c947141650 Author: Aurelien Bouteiller <[email protected]> Date: Fri Sep 22 11:54:46 2017 -0400 Bugfix: thread safety needs to reload and recheck the proc when observer changes commit 9eb779f8c8e9d3a53c9c159944fa83613be9e0e0 Author: George Bosilca <[email protected]> Date: Fri Sep 22 12:41:08 2017 -0400 Support barriers with 1 proc communicators. Make sure the barrier supports being called with a communicator of size 1. 
Signed-off-by: George Bosilca <[email protected]> commit f403bef6c2ec9f757881b13deaaed4c790b6bcf7 Author: Aurelien Bouteiller <[email protected]> Date: Thu Sep 21 16:15:46 2017 -0400 Bugfix: reset the req_complete field when redoing a wait_sync after a failure (Issue #19) commit 06bb8ed210288a0554897b872ee9a31c1766464a Merge: 79efd24f ab68aced Author: Aurelien Bouteiller <[email protected]> Date: Thu Sep 21 16:11:41 2017 -0400 Merge remote-tracking branch 'origin/heads/master' into ulfm commit 79efd24fe8f975f39b0d4bd61ee3e4dc2a99dd6d Author: Aurelien Bouteiller <[email protected]> Date: Tue Sep 12 20:42:32 2017 -0400 Bugfix: compilation problems --without-ft commit 88bae3699c36b5e9aec90b36ca313ed9ca6a3f74 Author: Aurelien Bouteiller <[email protected]> Date: Tue Sep 12 14:05:00 2017 -0400 Bugfix: simplified handling of --with-ft options commit 9ec76f804313215fe8d43c73579d8e06f501cc20 Author: Aurelien Bouteiller <[email protected]> Date: Mon Sep 11 18:04:10 2017 -0400 Remove the agreement in finalize. commit e856ed3b54e93384d756fb791866ea8a55b8c68d Author: Aurelien Bouteiller <[email protected]> Date: Thu Sep 7 17:22:59 2017 -0400 Removing finalize deadlocks from known problems commit 9d9aa8808500e3192633887c65f73d4d7e789abb Author: George Bosilca <[email protected]> Date: Thu Sep 7 21:13:53 2017 +0000 Update the README. commit ea42a96e2a84814d9d8f35b285ff6479e7a87db9 Author: Aurelien Bouteiller <[email protected]> Date: Wed Sep 6 16:40:43 2017 -0400 Fix: post-failure deadlocks in Finalize, and control FT with --disable-recovery rather than esotheric mca params. commit d37ac65a2acedb70e55176267c1586a39baf62fd Author: Aurelien Bouteiller <[email protected]> Date: Fri Sep 1 19:08:55 2017 -0400 bugfix: finalize detector after all but 1 rank died. 
commit 3eb197625d2f49d7da0fe268d044b0a6997e09f9 Author: Aurelien Bouteiller <[email protected]> Date: Thu Aug 31 16:53:50 2017 -0400 cleanup: remove dead code in finalize commit da229614428d6646ca5da3e91a93ba45f2be45f2 Author: Aurelien Bouteiller <[email protected]> Date: Thu Aug 31 16:20:39 2017 -0400 bugfix: redo the wait_sync_mt when a global sync interrupts another request commit 8285f9d3466919f8838609e3f054df229baa16c9 Author: Aurelien Bouteiller <[email protected]> Date: Thu Aug 31 16:14:36 2017 -0400 Bugfix: prevent updating the failed_grp from multipe threads commit 5a565247c83a20dfd684876acba1fa7633629ad0 Author: Aurelien Bouteiller <[email protected]> Date: Thu Aug 31 13:49:20 2017 -0400 Bugfix in detector finalization commit 6fcd853ff8b45ae599883d7bf76675ac969db52e Merge: 42a3858d d06b989d Author: Aurelien Bouteiller <[email protected]> Date: Tue Aug 29 02:37:54 2017 +0000 Merged in abouteiller/ulfm2/feature/README (pull request #2) Put README.ULFM in markdown and make it a self-contained install/getting started commit d06b989d277925f98a5575cf629b3c8c53c705ff Author: Aurelien Bouteiller <[email protected]> Date: Mon Aug 28 22:33:40 2017 -0400 Put README.ULFM in markdown and make it a self-contained install/getting started commit 42a3858df24fc3b2047e20b95797a3f2b80fef3b Merge: 97070faf 1434c0e6 Author: Aurelien Bouteiller <[email protected]> Date: Mon Aug 28 22:12:11 2017 +0000 Merged in abouteiller/ulfm2/feature/README (pull request #1) Feature/README commit 1434c0e61793f5b3e543fe6b0151e665c6e525f5 Author: Aurelien Bouteiller <[email protected]> Date: Mon Aug 28 17:02:26 2017 -0400 Update the README commit 23798cf84e35a99735a237226bab5fd811809bfd Author: Aurelien Bouteiller <[email protected]> Date: Thu Aug 10 11:25:04 2017 -0400 Update README commit d685eba8805da4617b604cdd7a1f72584537c7c4 Author: Aurelien Bouteiller <[email protected]> Date: Tue Aug 8 13:43:41 2017 -0400 Adding a README that's specific to ULFM It combines the old NEWS-ulfm from ULFM1 
INSTALL from Open MPI applies directly so no need for one commit 97070faf87190faf6c50ea0a0a8557e94ec51775 Author: Aurelien Bouteiller <[email protected]> Date: Mon Aug 28 15:47:51 2017 -0400 topo aware FD does not observe same-node sibling commit 938e0174959a3187037e1ac6356a9f6236fbc8ff Author: Aurelien Bouteiller <[email protected]> Date: Mon Aug 14 23:31:36 2017 -0400 Reduce noise and some finalize conditions in comm_detector commit 08c6f2d6e97ffe36389261edcbdd99f9a4ed38eb Author: Aurelien Bouteiller <[email protected]> Date: Mon Aug 14 23:29:27 2017 -0400 Reduce verbosity for events that are "normal" in FT with CMA commit 4fbd4d36933f2401330862429d420f8b179470ed Author: Aurelien Bouteiller <[email protected]> Date: Mon Aug 14 18:05:11 2017 -0400 Fallback to pmix abort if ompi abort cannot be issued commit baf523d73922b9e00c4c9f44b2de34283e0d2ebb Author: Aurelien Bouteiller <[email protected]> Date: Mon Aug 14 17:24:40 2017 -0400 Orte reports ERR_UNREACH or ERR_PROC_ABORTED when it detects local failures, take both into account. 
commit f4513c3458e44fcd0aa6db8dbd77c553572bbe2d Author: Aurelien Bouteiller <[email protected]> Date: Mon Aug 14 14:54:51 2017 -0400 Bug in upstream: cannot call ompi_abort from a pmix cb commit 8af800522eaf727c7f8ca8726cb7285765019483 Author: Aurelien Bouteiller <[email protected]> Date: Wed Aug 9 19:29:31 2017 -0400 Re-enable the TOPO graph operations, and trigger an appropriate warning when FT is enabled at the same time commit 1fc9c039585983eddf0c9cafc9176e253a82a26e Author: Aurelien Bouteiller <[email protected]> Date: Wed Aug 9 19:14:41 2017 -0400 Re-enable the RMA OSC operations, and trigger an appropriate warning when FT is enabled at the same time commit 0214c850587a9dc4c1f18d086c6ae76c9c5fef3d Author: Aurelien Bouteiller <[email protected]> Date: Wed Aug 9 18:34:08 2017 -0400 Re-enable files for non-FT runs, and generate an appropriate warning about what happens when using files and failures happen commit b20bd7c70eee582a93428e810e625bda829e975b Author: Aurelien Bouteiller <[email protected]> Date: Wed Aug 9 11:48:48 2017 -0400 Make --with-ft=mpi on by default on this fork commit 59fca1bc961668069f79f14baffc708c65b80869 Author: Aurelien Bouteiller <[email protected]> Date: Thu Jul 27 21:15:26 2017 -0400 make the sync_wakeup work in multithreaded runs commit 4f917d9863037e3522637350ebda4109a37a5c46 Author: Aurelien Bouteiller <[email protected]> Date: Thu Jul 27 20:46:31 2017 -0400 Proper cleanup of rdma registrations commit ad86f26cb16fcd530d7a4f265d60a4f5dedb7f64 Author: Aurelien Bouteiller <[email protected]> Date: Wed Jul 26 11:50:02 2017 -0400 Restore --with-ft option and enable vader BTL from changes in upstream commit 6f9abef3d444e05be2f664e4659f7fb4422e8350 Author: Aurelien Bouteiller <[email protected]> Date: Thu May 25 09:03:57 2017 -0400 Move proc_failed checks outside of the conditional check_args block commit c9783c52ade667362194da90b1132ca5afbb58a5 Author: Aurelien Bouteiller <[email protected]> Date: Thu May 25 09:40:29 2017 -0400 Add 
support for neighboring colls and other MPI 3.1 stuff cart/graph create commit 35cb76303963ec83aeb27c2109b374163d57c0f6 Author: Aurelien Bouteiller <[email protected]> Date: Wed May 24 13:04:00 2017 -0400 Make sure we do not initialize ERA and failure detector if FT is not requested; and fix a number of bugs when FT is not requesteed. commit 12de8f950596b9f0d93d4aa301dbdbb0f0179b7c Author: Aurelien Bouteiller <[email protected]> Date: Tue May 23 13:25:57 2017 -0400 An error introduced during rebase commit dbb86cb9cd68e7953a92728e2a9ee9fa15df3cd5 Author: Aurelien Bouteiller <[email protected]> Date: Mon May 22 13:31:55 2017 -0400 Remove the opal_array comm_epoch as it is not needed anymore commit 777b04cd67c8da1bbe95551ddd62b3bc1afd9a18 Author: Aurélien Bouteiller <[email protected]> Date: Tue Apr 18 15:05:22 2017 -0400 Missing an extern commit 471f3121ce75a4405d11349789c9c356ebe7b5c5 Author: Aurélien Bouteiller <[email protected]> Date: Mon Apr 17 23:54:42 2017 -0400 The epoch overflow check must happen after the cid overflow check commit ec2286eb1463f0486ad062b62ff505904e25a236 Author: Aurélien Bouteiller <[email protected]> Date: Mon Mar 27 16:35:16 2017 -0400 Reconcile the FT coll components with the new coll initialization (coll. become coll->) commit 9727e60ec6b9554f52532042fed928f389d6ac3c Author: Aurélien Bouteiller <[email protected]> Date: Mon Mar 27 14:58:58 2017 -0400 Update the nobuild list commit f787b5d78cec90fd73e4fba888297fd936f9ae75 Author: Aurélien Bouteiller <[email protected]> Date: Fri Feb 24 17:42:56 2017 -0500 Adding a default no-build list for known problematic components. 
commit ae557eedf91760e30bfbdd919156a756509c600d Author: Aurélien Bouteiller <[email protected]> Date: Wed Feb 15 17:54:33 2017 -0500 coll_base_module has been updated to 2_2_0 commit 5fd144f316023c37562fe4b09e8d266ebab613b0 Author: Aurélien Bouteiller <[email protected]> Date: Thu Jan 26 14:56:55 2017 -0500 Importing change from ULFM1 94f1fb9 (malloc(0) in ERA) commit 8d49d0ac9b20d8b519493aece25a61153ff275a2 Author: Aurélien Bouteiller <[email protected]> Date: Thu Jan 26 14:53:36 2017 -0500 The pmix-errhandler integration is not completely ready for prime yet commit 022897b5b589011b7c076759ca2a6b2b51c8ec86 Author: Aurélien Bouteiller <[email protected]> Date: Thu Jan 26 14:52:47 2017 -0500 Convert the agreement in finalize to the new signature and stronger sync before turning off the detector commit c67edf45b10b6fbae68d4bafc2c95a079932c703 Author: Aurélien Bouteiller <[email protected]> Date: Thu Nov 10 15:11:05 2016 -0500 Permit interruption of the wait_sync in case of errors commit c631c5599c2ec87588b2f3d3f059d83bf77b4f35 Author: Aurélien Bouteiller <[email protected]> Date: Mon Nov 7 15:30:12 2016 -0500 Fix iagree by making the need to update of the failed_group a parameter commit ebc714d3284aa99b94e65558e62bfa6ba01ac068 Author: Aurélien Bouteiller <[email protected]> Date: Thu Nov 3 14:42:08 2016 -0400 Restoring the errhandler/errmgr interaction to capture errors commit 4fa09b9780c94b5766cf0d522fdc974352926da3 Author: Aurélien Bouteiller <[email protected]> Date: Mon Oct 31 15:37:22 2016 -0400 cid_ft functions are operational again, shrink fixed. commit 15b5a4d1b13730929c46b23da442f26a4b88cc48 Author: Aurélien Bouteiller <[email protected]> Date: Thu Oct 20 16:44:54 2016 -0400 Make sure that the rbcast/detector tags are initialized before progressing the engine. 
commit da21280dffb6ba0252de0d04e02566e0b96e7000 Author: Aurélien Bouteiller <[email protected]> Date: Fri Oct 14 19:20:20 2016 -0400 We can save an agreement in finalize if we take care of ignoring stray rbcast at this time commit 355b4b0b2796bbcb0d2d4b6d07f16980346d4b0b Author: Aurélien Bouteiller <[email protected]> Date: Wed Oct 12 21:19:38 2016 -0400 Make errors detected in NBC collectives complete the operation, and stop COMM_COLL requests commit 8a88a81a83027f4aee446d77e0c074657f37a4b3 Author: Aurélien Bouteiller <[email protected]> Date: Wed Oct 12 21:19:04 2016 -0400 Some more REQUEST_COMPLETE fixes commit 461d209343f51021557c1f6f11d05911c4134d5a Author: Aurélien Bouteiller <[email protected]> Date: Wed Oct 12 18:34:39 2016 -0400 request_testall/some returns ERR_PROC_FAILED and REVOKED just like request_waitall/some (the mpi layer takes care of setting it to IN_STATUS again later).. commit 7481f709f8d360f07f9082d13fae1c67c7b7219b Author: Aurélien Bouteiller <[email protected]> Date: Wed Oct 12 18:33:45 2016 -0400 use REQUEST_COMPLETE in send_cancel commit 79fd44e4f550076f57e4b101e7f347a47ac013dc Author: Aurélien Bouteiller <[email protected]> Date: Wed Oct 12 17:42:55 2016 -0400 free_reqs does cancel the requests, so its replacement code should too. commit f710a951e1c36a9663575533f05cd59b43f85a33 Author: Aurélien Bouteiller <[email protected]> Date: Wed Oct 12 01:31:09 2016 -0400 Put back epochs in cid allocation commit 95e86ae7fb3fef16b0e5fdf2eff7b98eb4af28f1 Author: Aurélien Bouteiller <[email protected]> Date: Wed Oct 5 18:08:16 2016 -0400 gen_cid must set req_mpi_object.comm commit dfd07582e1c131aacc14ab3b81bde1f5745fce07 Author: Aurélien Bouteiller <[email protected]> Date: Tue Oct 4 22:44:45 2016 -0400 rebase on master commit 2da1b6bceb6ec52ac496f5a25101358e13db0892 Author: George Bosilca <[email protected]> Date: Mon May 9 12:25:01 2016 -0400 Add FT to summary. 
commit fce79759f7dcb58efa19b4948e6b66ada9807bb1 Author: Aurélien Bouteiller <[email protected]> Date: Fri May 6 17:03:20 2016 -0400 This hack has been committed by mistake commit e507d83a66155df1fe5196228068f90a4132387f Author: Aurélien Bouteiller <[email protected]> Date: Thu May 5 10:37:50 2016 -0400 Do the finalize in abort only if there were actual failures during the run commit c1eb96a997625129ccc4a690892f2f9e742ac245 Author: Aurélien Bouteiller <[email protected]> Date: Tue May 3 17:34:58 2016 -0400 Using the new OPAL_ENABLE_THREAD_MULTI where applicable and removing some useless rmb() Using the new OPAL_ENABLE_THREAD_MULTI where applicable and removing some useless rmb() Using OPAL_ENABLE_MULTI_THREADS and removing some useless rmb() commit d4a91a0a9607c10c52ee7a739754e10a16035a47 Author: Aurélien Bouteiller <[email protected]> Date: Tue May 3 17:34:14 2016 -0400 Fix a bug where the rank of immediate neighbors in the BMG where incorrectly computed commit 4ed5a3779c8295b501cf589bf520843b8fcdc7c8 Author: Aurélien Bouteiller <[email protected]> Date: Tue May 3 16:10:05 2016 -0400 reinstate the abort in finalize, as the fix pushed by ralph is not always working commit b31892772e4518e26603d4289fc3f0a57af2ef5f Author: Aurélien Bouteiller <[email protected]> Date: Tue May 3 16:06:44 2016 -0400 We need to synchronize before removing the FD callbacks commit 9dfe5dfc7200692330b939fcdb8965aac25b50fb Author: Aurélien Bouteiller <[email protected]> Date: Tue May 3 16:05:30 2016 -0400 Keep searching for the next hop in the ring of the BMG when it is found dead during a comm commit 9dad817b402de6c5a725de582cffb20f12f1ae54 Author: Aurélien Bouteiller <[email protected]> Date: Tue Apr 5 16:55:23 2016 -0400 Various Cray XK fixes commit 3921fbc38a9a913367f5c8c94682f4198aec6e06 Author: Aurélien Bouteiller <[email protected]> Date: Fri Apr 1 13:53:32 2016 -0400 Make the revoke ring more reliable Still not perfect as we do no reemit for failures detected after the initial post 
commit de1a5ce9b13f87aa5367ca5305a389ae56f8822b Author: Aurélien Bouteiller <[email protected]> Date: Fri Apr 1 13:52:53 2016 -0400 Adding a small injection facility to the interface (non-standard, for testing only) commit b92d0997b567ef8e14abd4e76124568e049b6589 Author: Aurélien Bouteiller <[email protected]> Date: Thu Mar 31 14:50:17 2016 -0400 Do not do extra stuff in Finalize when disable_ftmpi commit 6cd0d7cbe3e1ed2c436c4f98ece4ca57e9242da3 Author: Aurélien Bouteiller <[email protected]> Date: Thu Mar 31 01:39:37 2016 -0400 More thread safety in error reporting paths commit 8b8b3c2c8d120e9aa5141338bcc8c95e43d79397 Author: Aurélien Bouteiller <[email protected]> Date: Tue Mar 29 08:59:19 2016 -0400 various debugging stuff commit b6e6156d0026bf54897c20477887738462f5fbfd Author: Aurélien Bouteiller <[email protected]> Date: Mon Mar 28 15:59:27 2016 -0400 Move back these things in finalize to make sure they happen before we tear down BTL etc. commit 6b12383734735130b6a31ee2d5af6b63bf8ae6bd Author: Aurélien Bouteiller <[email protected]> Date: Fri Mar 25 08:40:12 2016 -0400 fix the FD thread sync variable being optimized out in -O3 commit c76d0e968d2c61a3f630680f615293217f48b015 Author: Aurélien Bouteiller <[email protected]> Date: Tue Mar 22 17:55:53 2016 -0400 rdma based heartbeat now works commit a3b35cc4d946cc7cf9a2af7afcfcb934d9a47a35 Author: Aurélien Bouteiller <[email protected]> Date: Tue Mar 22 09:52:24 2016 -0400 Adding RDMA based heartbeats commit 7a2603b2fe55f8e69ce80c6dbaace5ac3d37f7b8 Author: Aurélien Bouteiller <[email protected]> Date: Tue Mar 15 23:59:53 2016 -0400 Adding a thread to the FD. This cause a race in add_procs. commit 97f59ac7666d7430f2b50c16b411f7455552c3ba Author: Aurélien Bouteiller <[email protected]> Date: Tue Mar 15 17:41:36 2016 -0400 Detector is complete w/o progress thread. The timer resolution is a bit too coarse and false suspicions are common... 
commit 809fc3d6300b8333a99178290392ab6fe3b96116 Author: Aurélien Bouteiller <[email protected]> Date: Thu Mar 10 16:28:07 2016 -0500 Adding the fd to this repo. missing the thread and libevent timeout triggers commit e5932ef24746229ea0e2422e91aca3707bff9f32 Author: Aurélien Bouteiller <[email protected]> Date: Tue May 3 11:52:43 2016 -0400 use-mpi extensions should not have a lib.la commit 93699550538bc8800ffcc1fcddd1f6de9d71839c Author: Aurélien Bouteiller <[email protected]> Date: Thu Mar 17 13:45:26 2016 -0400 Fixing some issues in MPI_THREAD_MULTIPLE enabled builds Reinstate the pmix_fence in finalize Remove some duplicate debug messages commit ab485e166f60dccc233a13a4529a3e1012b4f7da Author: Aurélien Bouteiller <[email protected]> Date: Thu Mar 10 16:58:52 2016 -0500 Move the initialization/finalization of the revoke/rbcast etc in comm_init This initialization done elsewhere commit c1744c01ad087a97cd6f03bf3fe6fa669acb049e Author: Aurélien Bouteiller <[email protected]> Date: Thu Mar 10 16:28:51 2016 -0500 Fix the global variable warning with the failed_group commit 0822fe56dbe9dfcff02d5d784be033067332bf88 Author: Aurélien Bouteiller <[email protected]> Date: Wed Mar 9 14:57:03 2016 -0500 Silence warnings about failed TCP connections, which is a normal situation w/FT commit 84596b8aba2972047d8afeef0e1c334df2b02e63 Author: Aurélien Bouteiller <[email protected]> Date: Wed Mar 9 11:32:30 2016 -0500 Make sure we do not try to cancel completed requests commit d93e6289cc6f55a88649c77d1d9d4ffd581a6404 Author: Aurélien Bouteiller <[email protected]> Date: Wed Mar 9 10:59:54 2016 -0500 Make the CID collective tags part of the colletive tag namespace commit 840cc828916574a2ab8b051bc688efeb7d6c27fc Author: Aurélien Bouteiller <[email protected]> Date: Wed Mar 9 10:58:14 2016 -0500 Correctly promote ERR_PROC_FAILED_PENDING to PROC_FAILED for blocking operations and complete the request commit 0b5fcf45acf860bd3bc74eb1503b37d85cc33aff Author: Aurélien Bouteiller <[email 
protected]> Date: Tue Mar 8 17:22:23 2016 -0500 Fix a bug in intercomm_create and enable error returning from low level comms in all cases commit 71c0c65699f095c5a0aa7cc4982e113c40075769 Author: Aurélien Bouteiller <[email protected]> Date: Wed Mar 2 09:17:38 2016 -0500 cumulative copyright update commit b0138ff5141506a3ac1b6b060849bc9ba6b91df4 Author: Aurélien Bouteiller <[email protected]> Date: Wed Mar 2 01:45:56 2016 -0500 Disable auto-cleanup in orte to better test survivability of MPI layer. orte finalize is broken. commit 16dff489177807122a83f1e8b0004bbc7abf8ff5 Author: Aurélien Bouteiller <[email protected]> Date: Tue Mar 1 23:11:54 2016 -0500 The base logic for shrink_inter is there. As soon as cid_reduce_inter_ft is implemented it should work. commit db2a955f388916e61321cb9bcf683750d5191a01 Author: Aurélien Bouteiller <[email protected]> Date: Tue Mar 1 23:04:49 2016 -0500 Fix a bug in shrink where the failed group was used partially uninitialized commit 60469079c3bc2fdb08129174a5a2188b711a2418 Author: Aurélien Bouteiller <[email protected]> Date: Tue Mar 1 18:46:31 2016 -0500 Cleanup cruft from jjh original prototype commit 8187bcbd9139f9d52c27d14d3f86175b5edf9338 Author: Aurélien Bouteiller <bouteill…
We have a program that tests for the size returned from MPI_Pack_external_size with the external32 data representation. It should return the same value for both 32-bit and 64-bit applications, but it is returning different values.
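To illustrate why the two builds should agree, here is a minimal sketch (not the failing test program, which is not shown in this report). It assumes the element sizes that the MPI standard fixes for the external32 representation and compares them with the native sizes on the machine running the script. The helper names `external32_pack_size` and `native_size` are hypothetical and only serve this illustration. No Open MPI call is made.

```python
import ctypes

# Element sizes in bytes for the external32 representation, as fixed by the
# MPI standard. They do not depend on the platform or on the word size.
EXTERNAL32_SIZE = {
    "MPI_CHAR": 1,
    "MPI_SHORT": 2,
    "MPI_INT": 4,
    "MPI_LONG": 4,
    "MPI_LONG_LONG": 8,
    "MPI_FLOAT": 4,
    "MPI_DOUBLE": 8,
    "MPI_LONG_DOUBLE": 16,
}

# Corresponding C types, used only to query the native size on this machine.
NATIVE_CTYPE = {
    "MPI_CHAR": ctypes.c_char,
    "MPI_SHORT": ctypes.c_short,
    "MPI_INT": ctypes.c_int,
    "MPI_LONG": ctypes.c_long,
    "MPI_LONG_LONG": ctypes.c_longlong,
    "MPI_FLOAT": ctypes.c_float,
    "MPI_DOUBLE": ctypes.c_double,
    "MPI_LONG_DOUBLE": ctypes.c_longdouble,
}


def external32_pack_size(datatype: str, incount: int) -> int:
    """Bytes needed to pack `incount` elements in external32 (hypothetical helper)."""
    return incount * EXTERNAL32_SIZE[datatype]


def native_size(datatype: str, incount: int) -> int:
    """Bytes occupied by `incount` elements in the local representation."""
    return incount * ctypes.sizeof(NATIVE_CTYPE[datatype])


if __name__ == "__main__":
    incount = 10
    for name in EXTERNAL32_SIZE:
        ext = external32_pack_size(name, incount)
        nat = native_size(name, incount)
        marker = "" if ext == nat else "  <-- differs from native"
        print(f"{name:16s} external32={ext:4d} native={nat:4d}{marker}")
```

On a typical LP64 build, `MPI_LONG` is the kind of type where the two columns diverge (8 bytes natively, 4 in external32), so an implementation that reports the native size would give different answers in 32-bit and 64-bit applications. Whether that is the exact type exercised by our test is an assumption here.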