===============================================================================
Changes in 4.3
===============================================================================
# Support MPI memory allocation kinds side document.
# Support MPI ABI Proposal. Configure with --enable-mpi-abi and build with
mpicc_abi. By default, mpicc still builds and links with MPICH ABI.
# Experimental API MPIX_Op_create_x. It supports a user callback function with
an extra_state context and an op destructor callback. This lets language
bindings use a proxy function for language-specific user callbacks.
# Experimental API MPIX_{Comm,File,Session,Win}_create_errhandler_x. They allow
user error handlers to have extra_state context and corresponding destructor.
This allows language bindings to implement user error handlers via proxy.
# Experimental API MPIX_Request_is_complete. This is a pure request state query
function that neither invokes progress nor frees the request. It should help
applications that want to separate task dependency checking from the progress
engine to avoid progress contention, especially in multi-threaded contexts.
It is also useful for tools to profile non-deterministic calls such as
MPI_Test.
# Experimental API MPIX_Async_start. This function lets applications inject
progress hooks into MPI progress. It allows applications to implement custom
asynchronous operations that will be progressed by MPI. It avoids having to
implement separate progress mechanisms that may either take additional
resources or contend with MPI progress and negatively impact performance. It
also allows applications to create custom MPI operations, such as MPI
nonblocking collectives, and achieve near-native performance.
# Added benchmark tests test/mpi/bench/p2p_{latency,bw}.
# Added CMA support in CH4 IPC.
# Added IPC read algorithm for intranode Allgather and Allgatherv.
# Added CVAR MPIR_CVAR_CH4_SHM_POSIX_TOPO_ENABLE to enable non-temporal memcpy
for inter-numa shm communication.
# Added CVAR MPIR_CVAR_DEBUG_PROGRESS_TIMEOUT for debugging MPI deadlock issues.
# ch4:ucx now supports dynamic processes. MPI_Comm_spawn{_multiple} will work.
MPI_Open_port will fail because the ucx port name exceeds the current
MPI_MAX_PORT_NAME of 256. One can work around this by using the info hint
"port_name_size" with a larger port name buffer.
# PMI-1 defines PMI_MAX_PORT_NAME, which may be different from MPI_MAX_PORT_NAME.
This is used by "PMI_Lookup_name". Consequently, MPI_Lookup_name accepts info
hint "port_name_size" that may be larger than MPI_MAX_PORT_NAME. If the port
name does not fit in "port_name_size", it will return a truncation error.
# Autogen now defaults to -yaksa-depth=2.
# Default MPIR_CVAR_CH4_ROOTS_ONLY_PMI to on.
# Added ch4 netmod API am_tag_send and am_tag_recv.
# Added MPIR_CVAR_CH4_OFI_EAGER_THRESHOLD to force RNDV send mode.
# Make check target will run ROMIO tests.
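The progress-hook idea behind MPIX_Async_start can be pictured with a small
model in plain C. This is only a sketch under stated assumptions: it does not
call MPICH, and every name in it (hook_fn, hook_start, progress_poke,
HOOK_DONE, HOOK_PENDING, countdown_poll) is hypothetical and chosen for
illustration. It shows a progress engine polling registered callbacks and
dropping each one once it reports completion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes for a poll callback (not MPICH constants). */
enum { HOOK_PENDING = 0, HOOK_DONE = 1 };

/* A hook is a poll callback plus the caller's own state pointer. */
typedef int (*hook_fn)(void *state);

#define MAX_HOOKS 16

struct hook {
    hook_fn poll;
    void *state;
};

static struct hook hooks[MAX_HOOKS];
static size_t num_hooks = 0;

/* Register a callback; returns 0 on success, -1 if the table is full. */
int hook_start(hook_fn poll, void *state)
{
    assert(poll != NULL);
    if (num_hooks == MAX_HOOKS)
        return -1;
    hooks[num_hooks].poll = poll;
    hooks[num_hooks].state = state;
    num_hooks++;
    return 0;
}

/* One pass of the model progress engine: poll every hook once and keep
 * only the ones that are still pending. Returns the number kept. */
size_t progress_poke(void)
{
    size_t kept = 0;
    for (size_t i = 0; i < num_hooks; i++) {
        if (hooks[i].poll(hooks[i].state) == HOOK_PENDING)
            hooks[kept++] = hooks[i];
    }
    num_hooks = kept;
    return num_hooks;
}

/* Example hook: counts an int down by one per poll and completes when
 * it reaches zero. */
int countdown_poll(void *state)
{
    int *remaining = state;
    if (*remaining > 0)
        (*remaining)--;
    return (*remaining == 0) ? HOOK_DONE : HOOK_PENDING;
}
```

In this model the application never runs its own polling thread: it registers
countdown_poll once with hook_start, and the work advances whenever the
engine's progress_poke runs.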
===============================================================================
Changes in 4.2
===============================================================================
# Complete support for the MPI 4.1 specification.
# Experimental thread communicator feature (e.g. MPIX_Threadcomm_init).
See paper "Frustrated With MPI+Threads? Try MPIxThreads!",
https://doi.org/10.1145/3615318.3615320.
# Experimental datatype functions MPIX_Type_iov_len and MPIX_Type_iov.
# Experimental op MPIX_EQUAL for MPI_Reduce and MPI_Allreduce (intra
communicator only)
# Use --with-{pmi,pmi2,pmix}=[path] to configure an external PMI library.
Convenience options for Slurm and Cray are deprecated. Use --with-pmi=oldcray
for older Cray environments.
# Error checking default changed to runtime (used to be all).
# Use the error handler bound to MPI_COMM_SELF as the default error handler.
# Use ierror instead of ierr in the "use mpi" Fortran interface. This affects
user code that calls with an explicit keyword, e.g. call MPI_Init(ierr=arg).
"ierror" is the correct name specified in the MPI specification. The
subroutine interfaces in "mpi.mod" were only added in 4.1.
# Handle conversion functions, such as MPI_Comm_c2f, MPI_Comm_f2c, etc., are
no longer macros. MPI-4.1 requires these to be actual functions.
# Yaksa updated to auto detect the GPU architecture and only build for
the detected arch. This applies to CUDA and HIP support.
# MPI_Win_shared_query can be used on windows created by MPI_Win_create,
MPI_Win_allocate, in addition to windows created by MPI_Win_allocate_shared.
MPI_Win_allocate will create shared memory whenever feasible, including between
spawned processes on the same node.
# Fortran mpi.mod supports Type(c_ptr) buffer output for MPI_Alloc_mem,
MPI_Win_allocate, and MPI_Win_allocate_shared.
# New functions added in MPI-4.1: MPI_Remove_error_string, MPI_Remove_error_code,
and MPI_Remove_error_class
# New functions added in MPI-4.1: MPI_Request_get_status_all,
MPI_Request_get_status_any, and MPI_Request_get_status_some.
# New function added in MPI-4.1: MPI_Type_get_value_index.
# New functions added in MPI-4.1: MPI_Comm_attach_buffer, MPI_Session_attach_buffer,
MPI_Comm_detach_buffer, MPI_Session_detach_buffer,
MPI_Buffer_flush, MPI_Comm_flush_buffer, MPI_Session_flush_buffer,
MPI_Buffer_iflush, MPI_Comm_iflush_buffer, and MPI_Session_iflush_buffer.
Also added constant MPI_BUFFER_AUTOMATIC to allow automatic buffers.
# Support for "mpi_memory_alloc_kinds" info key. Memory allocation kind
requests can be made via argument to mpiexec, or as info during
session creation. Kinds supported are "mpi" (with standard defined
restrictors) and "system". Queries for supported kinds can be made on
MPI objects such as sessions, comms, windows, or files. MPI 4.1 states
that supported kinds can also be found in MPI_INFO_ENV, but it was
decided at the October 2023 meeting that this was a mistake and will
be removed in an erratum.
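As an illustration of how a "mpi_memory_alloc_kinds" value might be inspected,
here is a minimal sketch in plain C. It assumes the value is a comma-separated
list of kinds, each optionally followed by colon-separated restrictors (for
example "mpi:alloc_mem,system"). The function name alloc_kind_listed is
hypothetical and is not part of MPI or MPICH; real code would first obtain the
string from an info object.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if `kind` names one of the entries in `kinds`, 0 otherwise.
 * `kinds` is assumed to look like "mpi:alloc_mem,system": entries are
 * separated by commas, and restrictors follow the kind after a colon.
 * Only the kind name is compared; restrictors are ignored. */
int alloc_kind_listed(const char *kinds, const char *kind)
{
    assert(kinds != NULL && kind != NULL);
    size_t want = strlen(kind);
    const char *p = kinds;

    while (*p != '\0') {
        /* Length of the whole entry, and of the kind name inside it. */
        size_t entry_len = strcspn(p, ",");
        size_t name_len = strcspn(p, ":,");

        if (name_len == want && strncmp(p, kind, want) == 0)
            return 1;

        p += entry_len;
        if (*p == ',')
            p++;    /* step over the separator to the next entry */
    }
    return 0;
}
```

With the value "mpi:alloc_mem,system", the function reports both "mpi" and
"system" as listed, but not "alloc_mem", because that is a restrictor rather
than a kind.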
===============================================================================
Changes in 4.1
===============================================================================
# Thread-cs in ch4 changed to per-vci.
# Testsuite (test/mpi) is configured separately from mpich configure.
# Added options in autogen to accelerate CI builds, including using pre-built
sub-modules. Added -yaksa-depth option to generate shallower yaksa pup code
for faster build and smaller binaries.
# Support singleton init using hydra.
# On OSX, link option flat_namespace is no longer turned on by default.
# Generate mpi.mod Fortran interfaces using Python 3. For many compilers,
including gfortran, flags such as -fallow-argument-mismatch are no longer
necessary.
# Fixed message queue debugger interface in ch4.
# PMI (src/pmi) is refactored as a subdir and can be separately distributed.
# Added MPIX_Comm_get_failed.
# Experimental MPIX stream API to enable explicit thread contexts.
# Experimental MPIX gpu enqueue API. It currently only supports CUDA streams.
# Delays GPU resource allocation in yaksa.
# CH3 nemesis ofi netmod is removed.
# New collective algorithms. All collective algorithms are listed in
src/mpi/coll/coll_algorithms.txt
# Removed hydra2. We will port unique features of hydra2, including
tree-launching, to hydra in a future release.
# Added in-repository wiki documentation.
# Added stream workq to support optimizations for enqueue operations.
# Better support for large count APIs by eliminating type conversion issues.
# Hydra now uses libpmi (src/pmi) for handling PMI messages.
# Many bug fixes and enhancements.
===============================================================================
Changes in 4.0
===============================================================================
# All MPI-4 APIs have been implemented. Major MPI-4 features include MPI
sessions, partitioned point-to-point communications, events in the MPI tool
information interface, large-count functions, persistent collectives,
MPI_Comm_idup_with_info, MPI_Isendrecv and MPI_Isendrecv_replace,
MPI_Info_get_string, MPI_Comm_split_type with new split_type --
MPI_COMM_TYPE_HW_GUIDED and MPI_COMM_TYPE_HW_UNGUIDED.
# Add QMPI (experimental) support.
# Add MPIX_Delete_error_{class,code,string}.
# MPI_Info objects can be accessed before MPI_Init{_thread}.
# Generate C API interface functions including man page notes and error
checking using Python scripts.
# Generate Fortran bindings using Python scripts.
# Generate collective entrance functions and generate per-algorithm tests.
# Support explicit --without-cuda configure option.
# Drop support for UCX version < 1.7.0.
# Configure now optionally requires Python 3 (when F08 is enabled).
# Multi-NIC support in ch4:ofi.
# Default to ch4:ofi when configure doesn't have a clear choice. Add message
block at the end of configure to advise user.
# Multiple VCI is fully implemented including the active message fallback paths.
# Extend IPC to support non-contig datatypes.
# Add AMD GPU support using HIP.
# Add generic RNDV callback mechanism with active messages.
# Refactor ch4 dynamic process functions.
# Avoid building MPL and hwloc multiple times.
# Many bug fixes and code clean-ups.
===============================================================================
Changes in 3.4
===============================================================================
# ch4 replaces ch3 as the default device configuration. If no network
module is specified at configuration-time, MPICH will search the
user environment in order to select one to build. The user will be
prompted to choose if no preferred network library is detected.
# Add support for Yaksa datatype engine (default in ch4).
# Add support for GPU buffers (CUDA, Level Zero) in pt2pt,
collectives, and one-sided communication.
# Add support for XPMEM.
# Add support for multiple virtual communication interfaces for more
efficient MPI_THREAD_MULTIPLE (experimental).
# Add DAOS ADIO driver to ROMIO (contributed by Intel).
# Add Quobyte ADIO driver to ROMIO (contributed by Quobyte).
# Add support for Arm compiler toolchain
# Add support for NVIDIA HPC compilers
# Add support for flang/f18 Fortran compiler
# Add support for AddressSanitizer and UndefinedBehaviorSanitizer to
debug configuration
# Remove mxm, llc, and portals4 netmods from ch3.
# Remove support for logical reduction operations on floating point
types.
# Remove MPIX_Mutex interfaces.
# Further improvements to ch4 business card exchange: extra
long address support and fixes for PMIx integration.
# Un-inline non-critical ch4 code for improved build times.
# Fix several test program bugs.
# Fix several static analysis and compiler warnings.
# Change the signature of MPID_Init to include requested and provided
thread levels.
===============================================================================
Changes in 3.3.2
===============================================================================
# Add support for struct sockaddr in MPICH, Hydra, and PMI socket
code. Works with both IPv4 and IPv6 addresses.
# Fix localhost detection on FreeBSD and macOS, avoiding long delay
during startup.
# Fix thread-local storage detection.
# Fix several test program bugs.
# Fix several static analysis and compiler warnings.
===============================================================================
Changes in 3.3.1
===============================================================================
# Fix bug in MPI_Testany/MPI_Waitany that could cause deadlock
# Add missing functionality in Argobots library support
# Fix configure-time detection for thread local storage support
# Better support for reproducible builds. Thanks to Bernhard
Wiedemann for the report and fixes
# Fix support for XL compiler toolchain
# Add support for -static-intel linking option
# Fix building on systems without weak symbols
# Fix several static analysis and compiler warnings
===============================================================================
Changes in 3.3
===============================================================================
# CH4 Device: A new device layer implementation designed for low software
overheads. CH4 has experimental support for OFI and UCX network libraries,
and POSIX shared memory. Thanks to Intel, Mellanox, and RIKEN AICS for
participating in the CH4 coding effort.
# Fixed SLURM integration in Hydra for new node list format.
# Added support for PMIx (https://pmix.github.io/pmix/) client
library in CH4 netmods. Note that you must use a compatible PMIx
server in this configuration.
# Better organization of collectives in the MPI layer. The new
scheme, which de-couples implementation from selection logic,
enables easier integration of additional algorithms.
# TSP collectives framework: A C++-template style framework for
collective algorithms is added to allow a single collective
implementation to move data over generic or device-specific
transport functions.
# Improvements to derived datatype testing (DTPools -
https://github.com/pmodels/mpich/blob/main/doc/wiki/design/DTPools.md).
# Added new "non-catastrophic" error codes to expose internal
resource exhaustion.
# Added info hints to MPI_Comm_split_type to support splitting
communicators by machine topology. Both on-node (socket, core,
etc.) and off-node (switch-level) hints are defined.
# Improvements to MPI_THREAD_MULTIPLE in CH4 through new thread safety
models at the Virtual Network Interface (VNI) level. This introduces two
new models that leverage work-queues to offload operations and improve
scalability under contention.
# Message Driven Thread Activation (MDTA). An alternative locking
model is defined for MPI_THREAD_MULTIPLE in CH4.
# Added PMI usage optimizations for business card exchange in CH4
netmods.
# Improvements on MPI_Abort. MPI_Abort invoked on subcommunicators will
only abort the connected processes within that communicator.
# Cleanup of whitespace (ch3 excluded) using the
maint/code-cleanup.sh script. For instructions on how to update
PRs/branches based on MPICH before the cleanup, see
https://github.com/pmodels/mpich/wiki/Code-Cleanup-Procedure.
# Removed the PAMI device and poe PMI client.
# C99 compiler support is now required to build MPICH.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.2..v3.3
A list of bugs that have been fixed is available at the following
link:
https://github.com/pmodels/mpich/milestone/25?closed=1
===============================================================================
Changes in 3.2
===============================================================================
# Added support for MPI-3.1 features including nonblocking collective I/O,
address manipulation routines, thread-safety for MPI initialization,
pre-init functionality, and new MPI_T routines to look up variables
by name.
# Fortran 2008 bindings are enabled by default and fully supported.
# Added support for the Mellanox MXM InfiniBand interface. (thanks
to Mellanox for the code contribution).
# Added support for the Mellanox HCOLL interface for collectives.
(thanks to Mellanox for the code contribution).
# Significant stability improvements to the MPICH/portals4
implementation.
# Completely revamped RMA infrastructure including several
scalability improvements, performance improvements, and bug fixes.
# Added experimental support for Open Fabrics Interfaces (OFI) version 1.0.0.
https://github.com/ofiwg/libfabric (thanks to Intel for code contribution)
# The Myrinet MX network module, which had a life cycle from 1.1 till
3.1.2, has now been deleted.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.1.3..v3.2
A full list of bugs that have been fixed is available at the
following link:
https://trac.mpich.org/projects/mpich/query?status=closed&group=resolution&milestone=mpich-3.2
===============================================================================
Changes in 3.1.3
===============================================================================
# Several enhancements to Portals4 support.
# Several enhancements to PAMI (thanks to IBM for the code contribution).
# Several enhancements to the CH3 RMA implementation.
# Several enhancements to ROMIO.
# Fixed deadlock in multi-threaded MPI_Comm_idup.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.1.2..v3.1.3
A full list of bugs that have been fixed is available at the
following link:
https://trac.mpich.org/projects/mpich/query?status=closed&group=resolution&milestone=mpich-3.1.3
===============================================================================
Changes in 3.1.2
===============================================================================
# Significant enhancements to the BG/Q device, especially for RMA and
shared memory functionality.
# Several enhancements to ROMIO.
# Upgraded to hwloc-1.9.
# Added more Fortran 2008 (F08) tests and fixed a few F08 binding bugs.
Now all MPICH F90 tests have been ported to F08.
# Updated weak alias support to align with gcc-4.x
# Minor enhancements to the CH3 RMA implementation.
# Better implementation of MPI_Allreduce for intercommunicator.
# Added environment variables to control memory tracing overhead.
# Added flags to enable C99 mode with Solaris compilers.
# Updated implementation of MPI-T CVARs of type MPI_CHAR, as interpreted
in MPI-3.0 Errata.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.1.1..v3.1.2
A full list of bugs that have been fixed is available at the
following link:
https://trac.mpich.org/projects/mpich/query?status=closed&group=resolution&milestone=mpich-3.1.2
===============================================================================
Changes in 3.1.1
===============================================================================
# Blue Gene/Q implementation supports MPI-3. This release contains a
functional and compliant Blue Gene/Q implementation of the MPI-3 standard.
Instructions to build on Blue Gene/Q are on the mpich.org wiki:
https://github.com/pmodels/mpich/blob/main/doc/wiki/source_code/BGQ.md
# Fortran 2008 bindings (experimental). Build with --enable-fortran=all. Must have
a Fortran 2008 + TS 29113 capable compiler.
# Significant rework of MPICH library management and which symbols go
into which libraries. Also updated MPICH library names to make
them consistent with Intel MPI, Cray MPI and IBM PE MPI. Backward
compatibility links are provided for older mpich-based build
systems.
# The ROMIO "Blue Gene" driver has seen significant rework. We have separated
"file system" features from "platform" features, since GPFS shows up in more
places than just Blue Gene
# New ROMIO options for aggregator selection and placement on Blue Gene
# Optional new ROMIO two-phase algorithm requiring less communication for
certain workloads
# The old ROMIO optimization "deferred open" either stopped working or was
disabled on several platforms.
# Added support for powerpcle compiler. Patched libtool in MPICH to support
little-endian powerpc linux host.
# Fixed the prototype of the Reduce_local C++ binding. The previous
prototype was completely incorrect. Thanks to Jeff Squyres for
reporting the issue.
# The mpd process manager, which was deprecated and unsupported for
the past four major release series (1.3.x till 3.1), has now been
deleted. RIP.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.1..v3.1.1
A full list of bugs that have been fixed is available at the
following link:
https://trac.mpich.org/projects/mpich/query?status=closed&group=resolution&milestone=mpich-3.1.1
===============================================================================
Changes in 3.1
===============================================================================
# Implement runtime compatibility with MPICH-derived implementations as per
the ABI Compatibility Initiative (see www.mpich.org/abi for more
information).
# Integrated MPICH-PAMI code base for Blue Gene/Q and other IBM
platforms.
# Several improvements to the SCIF netmod. (code contribution from
Intel).
# Major revamp of the MPI_T interface added in MPI-3.
# Added environment variables to control a lot more capabilities for
collectives. See the README.envvar file for more information.
# Allow non-blocking collectives and fault tolerance at the same
time. The option MPIR_PARAM_ENABLE_COLL_FT_RET has been deprecated as
it is no longer necessary.
# Improvements to MPI_WIN_ALLOCATE to internally allocate shared
memory between processes on the same node.
# Performance improvements for MPI RMA operations on shared memory
for MPI_WIN_ALLOCATE and MPI_WIN_ALLOCATE_SHARED.
# Enable shared library builds by default.
# Upgraded hwloc to 1.8.
# Several improvements to the Hydra-SLURM integration.
# Several improvements to the Hydra process binding code. See the
Hydra wiki page for more information:
https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md
# MPICH now supports operations on very large datatypes (those that describe
more than 32 bits of data). This work also allows MPICH to fully support
MPI-3's introduction of MPI_Count.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.0.4..v3.1
A full list of bugs that have been fixed is available at the
following link:
https://trac.mpich.org/projects/mpich/query?status=closed&group=resolution&milestone=mpich-3.1
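The large-datatype item above concerns sizes that no longer fit in a 32-bit
int. A minimal sketch of the underlying arithmetic, in plain C: int64_t stands
in here for a wide count type such as MPI_Count (an assumption, since the
actual width of MPI_Count is implementation-defined), and the helper names
total_bytes and fits_in_int are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Total bytes described by `count` elements of `elem_size` bytes each.
 * Both operands are widened before the multiply, so the product cannot
 * wrap the way an int-by-int multiplication could. */
int64_t total_bytes(int count, int elem_size)
{
    assert(count >= 0 && elem_size >= 0);
    return (int64_t) count * (int64_t) elem_size;
}

/* Return 1 if the total size is still representable as a plain int. */
int fits_in_int(int count, int elem_size)
{
    return total_bytes(count, elem_size) <= (int64_t) INT_MAX;
}
```

For example, one billion 8-byte elements describe 8,000,000,000 bytes, which
total_bytes returns correctly and fits_in_int rejects on platforms with a
32-bit int.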
===============================================================================
Changes in 3.0.4
===============================================================================
# BUILD SYSTEM: Reordered the default compiler search to prefer Intel
and PG compilers over GNU compilers because of the performance
difference.
WARNING: If you do not explicitly specify the compiler you want
through CC and friends, this might break ABI for you relative to
the previous 3.0.x release.
# OVERALL: Added support to manage per-communicator eager-rendezvous
thresholds.
# PM/PMI: Performance improvements to the Hydra process manager on
large-scale systems by allowing for key/value caching.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.0.3..v3.0.4
===============================================================================
Changes in 3.0.3
===============================================================================
# RMA: Added a new mechanism for piggybacking RMA synchronization operations,
which improves the performance of several synchronization operations,
including Flush.
# RMA: Added an optimization to utilize the MPI_MODE_NOCHECK assertion in
passive target RMA to improve performance by eliminating a lock request
message.
# RMA: Added a default implementation of shared memory windows to CH3. This
adds support for this MPI 3.0 feature to the ch3:sock device.
# RMA: Fix a bug that resulted in an error when RMA operation request handles
were completed outside of a synchronization epoch.
# PM/PMI: Upgraded to hwloc-1.6.2rc1. This version uses libpciaccess
instead of libpci, to work around the GPL license used by libpci.
# PM/PMI: Added support for the Cobalt process manager.
# BUILD SYSTEM: allow MPI_LONG_DOUBLE_SUPPORT to be disabled with a configure
option.
# FORTRAN: fix MPI_WEIGHTS_EMPTY in the Fortran bindings
# MISC: fix a bug in MPI_Get_elements where it could return incorrect values
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.0.2..v3.0.3
===============================================================================
Changes in 3.0.2
===============================================================================
# PM/PMI: Upgrade to hwloc-1.6.1
# RMA: Performance enhancements for shared memory windows.
# COMPILER INTEGRATION: minor improvements and fixes to the clang static type
checking annotation macros.
# MPI-IO (ROMIO): improved error checking for user errors, contributed by IBM.
# MPI-3 TOOLS INTERFACE: new MPI_T performance variables providing information
about nemesis communication behavior and CH3 message matching queues.
# TEST SUITE: "make testing" now also outputs a "summary.tap" file that can be
interpreted with standard TAP consumer libraries and tools. The
"summary.xml" format remains unchanged.
# GIT: This is the first release built from the new git repository at
git.mpich.org. A few build system mechanisms have changed because of this
switch.
# BUG FIX: resolved a compilation error related to LLONG_MAX that affected
several users (ticket #1776).
# BUG FIX: nonblocking collectives now properly make progress when MPICH is
configured with the ch3:sock channel (ticket #1785).
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.0.1..v3.0.2
===============================================================================
Changes in 3.0.1
===============================================================================
# PM/PMI: Critical bug-fix in Hydra to work correctly in multi-node
tests.
# A full list of changes is available using:
svn log -r10790:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich-3.0.1
... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich-3.0.1?action=follow_copy&rev=HEAD&stop_rev=10790&mode=follow_copy
===============================================================================
Changes in 3.0
===============================================================================
# MPI-3: All MPI-3 features are now implemented and the MPI_VERSION
bumped up to 3.0.
# OVERALL: Added support for ARM-v7 native atomics
# MPE: MPE is now separated out of MPICH and can be downloaded/used
as a separate package.
# PM/PMI: Upgraded to hwloc-1.6
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available using:
svn log -r10344:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich-3.0
... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich-3.0?action=follow_copy&rev=HEAD&stop_rev=10344&mode=follow_copy
===============================================================================
Changes in 1.5
===============================================================================
# OVERALL: Nemesis now supports an "--enable-yield=..." configure
option for better performance/behavior when oversubscribing
processes to cores. Some form of this option is enabled by default
on Linux, Darwin, and systems that support sched_yield().
# OVERALL: Added support for Intel Many Integrated Core (MIC)
architecture: shared memory, TCP/IP, and SCIF based communication.
# OVERALL: Added support for IBM BG/Q architecture. Thanks to IBM
for the contribution.
# MPI-3: const support has been added to mpi.h, although it is
disabled by default. It can be enabled on a per-translation unit
basis with "#define MPICH2_CONST const".
# MPI-3: Added support for MPIX_Type_create_hindexed_block.
# MPI-3: The new MPI-3 nonblocking collective functions are now
available as "MPIX_" functions (e.g., "MPIX_Ibcast").
# MPI-3: The new MPI-3 neighborhood collective routines are now available as
"MPIX_" functions (e.g., "MPIX_Neighbor_allgather").
# MPI-3: The new MPI-3 MPI_Comm_split_type function is now available
as an "MPIX_" function.
# MPI-3: The new MPI-3 tools interface is now available as "MPIX_T_"
functions. This is a beta implementation right now with several
limitations, including no support for multithreading. Several
performance variables related to CH3's message matching are exposed
through this interface.
# MPI-3: The new MPI-3 matched probe functionality is supported via
the new routines MPIX_Mprobe, MPIX_Improbe, MPIX_Mrecv, and
MPIX_Imrecv.
# MPI-3: The new MPI-3 nonblocking communicator duplication routine,
MPIX_Comm_idup, is now supported. It will only work for
single-threaded programs at this time.
# MPI-3: MPIX_Comm_reenable_anysource support
# MPI-3: Native MPIX_Comm_create_group support (updated version of
the prior MPIX_Group_comm_create routine).
# MPI-3: MPI_Intercomm_create's internal communication no longer interferes
with point-to-point communication, even if point-to-point operations on the
parent communicator use the same tag or MPI_ANY_TAG.
# Build system: Completely revamped build system to rely fully on
autotools. Parallel builds ("make -j8" and similar) are now supported.
# Build system: Renamed "./maint/updatefiles" --> "./autogen.sh" and
"configure.in" --> "configure.ac".
# JUMPSHOT: Improvements to Jumpshot to handle thousands of
timelines, including performance improvements to slog2 in such
cases.
# JUMPSHOT: Added navigation support to locate the ends of a chosen
drawable when the viewport has been scrolled far from it.
# PM/PMI: Added support for memory binding policies.
# PM/PMI: Various improvements to the process binding support in
Hydra. Several new pre-defined binding options are provided.
# PM/PMI: Upgraded to hwloc-1.5
# PM/PMI: Several improvements to PBS support to natively use the PBS
launcher.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available using:
svn log -r8478:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.5
... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.5?action=follow_copy&rev=HEAD&stop_rev=8478&mode=follow_copy
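As a rough illustration of the "--enable-yield=..." entry above, the
sketch below shows why yielding helps when processes are oversubscribed
to cores: a polling loop gives up the CPU after each unsuccessful poll
so that another runnable process can make progress. This is not Nemesis
code. The names wait_with_yield, poll_fn, and countdown_poll are
hypothetical; only sched_yield() is taken from the entry.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical progress callback: returns true once the awaited work
 * has completed. */
typedef bool (*poll_fn)(void *state);

/* Poll until the callback reports completion, yielding the CPU after
 * every unsuccessful poll. Returns the number of yields performed. */
long wait_with_yield(poll_fn poll, void *state)
{
    assert(poll != NULL);
    long yields = 0;
    while (!poll(state)) {
        sched_yield();  /* let another runnable process use this core */
        yields++;
    }
    return yields;
}

/* Example callback for demonstration: reports completion once the
 * counter it is given has reached zero. */
bool countdown_poll(void *state)
{
    int *remaining = state;
    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

Calling wait_with_yield(countdown_poll, &n) with n set to 3 polls four
times and yields three times before returning.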
===============================================================================
Changes in 1.4.1
===============================================================================
# OVERALL: Several improvements to the ARMCI API implementation
within MPICH2.
# Build system: Added beta support for DESTDIR while installing
MPICH2.
# PM/PMI: Upgraded hwloc to 1.2.1rc2.
# PM/PMI: Initial support for the PBS launcher.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available using:
svn log -r8675:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.4.1
... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.4.1?action=follow_copy&rev=HEAD&stop_rev=8675&mode=follow_copy
===============================================================================
Changes in 1.4
===============================================================================
# OVERALL: Improvements to fault tolerance for collective
operations. Thanks to Rui Wang @ ICT for reporting several of these
issues.
# OVERALL: Improvements to the universe size detection. Thanks to
Yauheni Zelenko for reporting this issue.
# OVERALL: Bug fixes for Fortran attributes on some systems. Thanks
to Nicolai Stange for reporting this issue.
# OVERALL: Added new ARMCI API implementation (experimental).
# OVERALL: Added new MPIX_Group_comm_create function to allow
non-collective creation of sub-communicators.
# FORTRAN: Bug fixes in the MPI_DIST_GRAPH_ Fortran bindings.
# PM/PMI: Support for a manual "none" launcher in Hydra to allow for
higher-level tools to be built on top of Hydra. Thanks to Justin
Wozniak for reporting this issue, for providing several patches for
the fix, and for testing it.
# PM/PMI: Bug fixes in Hydra to handle non-uniform layouts of hosts
better. Thanks to the MVAPICH group at OSU for reporting this issue
and testing it.
# PM/PMI: Bug fixes in Hydra to handle cases where only a subset of
the available launchers or resource managers is compiled in. Thanks
to Satish Balay @ Argonne for reporting this issue.
# PM/PMI: Support for a different username to be provided for each
host; this only works for launchers that support this (such as
SSH).
# PM/PMI: Bug fixes for using Hydra on AIX machines. Thanks to
Kitrick Sheets @ NCSA for reporting this issue and providing the
first draft of the patch.
# PM/PMI: Fixed a bug in memory allocation/management for environment
variables that was showing up on older platforms. Thanks to Steven
Sutphen for reporting the issue and providing detailed analysis to
track down the bug.
# PM/PMI: Added support for providing a configuration file to pick
the default options for Hydra. Thanks to Saurabh T. for reporting
the issues with the current implementation and working with us to
improve this option.
# PM/PMI: Improvements to the error code returned by Hydra.
# PM/PMI: Bug fixes for handling "=" in environment variable values in
Hydra.
# PM/PMI: Upgraded the hwloc version to 1.2.
# COLLECTIVES: Performance and memory usage improvements for MPI_Bcast
in certain cases.
# VALGRIND: Fix incorrect Valgrind client request usage when MPICH2 is
built for memory debugging.
# BUILD SYSTEM: "--enable-fast" and "--disable-error-checking" are once
again valid simultaneous options to configure.
# TEST SUITE: Several new tests for MPI RMA operations.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available using:
svn log -r7838:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.4
... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.4?action=follow_copy&rev=HEAD&stop_rev=7838&mode=follow_copy
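The entry above about "=" in environment variable values does not
describe the fix itself. As a minimal sketch of one way to parse such
entries safely, the function below splits "NAME=VALUE" at the first "="
only, so any later "=" characters stay part of the value. It is an
illustration and not Hydra's code; split_env_entry and its parameters
are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split "NAME=VALUE" at the FIRST '=' so that the value may itself
 * contain '=' characters. The name is copied into name_buf. Returns a
 * pointer to the value inside entry, or NULL if entry has no '=' or
 * the name does not fit into name_buf. */
const char *split_env_entry(const char *entry, char *name_buf,
                            size_t name_cap)
{
    assert(entry != NULL && name_buf != NULL);

    const char *eq = strchr(entry, '=');
    if (eq == NULL)
        return NULL;

    size_t name_len = (size_t)(eq - entry);
    if (name_len + 1 > name_cap)
        return NULL;            /* no room for the name plus its NUL */

    memcpy(name_buf, entry, name_len);
    name_buf[name_len] = '\0';
    return eq + 1;              /* everything after the first '=' */
}
```

Given the entry "OPTS=a=1,b=2", this yields the name "OPTS" and the
value "a=1,b=2". Splitting at the last "=" would instead produce the
name "OPTS=a=1,b" and the value "2".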
===============================================================================
Changes in 1.3.2
===============================================================================
# OVERALL: MPICH2 now recognizes the OSX mach_absolute_time as a
native timer type.
# OVERALL: Performance improvements to MPI_Comm_split on large
systems.
# OVERALL: Several improvements to error-return capabilities in the
presence of faults.
# PM/PMI: Several fixes and improvements to Hydra's process binding
capability.
# PM/PMI: Upgraded the hwloc version to 1.1.1.
# PM/PMI: Allow users to sort node lists allocated by resource
managers in Hydra.
# PM/PMI: Improvements to signal handling. Hydra now respects Ctrl-Z
signals and passes them on to the application.
# PM/PMI: Improvements to STDOUT/STDERR handling including improved
support for rank prepending on output. Improvements to STDIN
handling for applications being run in the background.
# PM/PMI: Split the bootstrap servers into "launchers" and "resource
managers", allowing the user to pick a different resource manager
from the launcher. For example, the user can now pick the "SLURM"
resource manager and "SSH" as the launcher.
# PM/PMI: The MPD process manager is deprecated.
# PM/PMI: The PLPA process binding library support is deprecated.
# WINDOWS: Added support for gfortran and 64-bit gcc libs.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available using:
svn log -r7457:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.3.2
... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.3.2?action=follow_copy&rev=HEAD&stop_rev=7457&mode=follow_copy
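To make the "rank prepending on output" entry above concrete, the
sketch below tags every line of a chunk of process output with the rank
that produced it. It is an illustration only and not Hydra's
implementation. The function name prepend_rank and the "[rank] " prefix
format are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Copy text into out, prefixing every line with "[rank] " (an assumed
 * prefix format). Returns the number of bytes written, not counting
 * the terminating NUL, or -1 if out is too small. */
int prepend_rank(int rank, const char *text, char *out, size_t out_cap)
{
    assert(text != NULL && out != NULL);

    size_t used = 0;
    while (*text != '\0') {
        /* A line runs up to and including its '\n', or to the end of
         * the text if the last line has no newline. */
        const char *nl = strchr(text, '\n');
        size_t line_len = nl ? (size_t)(nl - text) + 1 : strlen(text);

        int n = snprintf(out + used, out_cap - used, "[%d] %.*s",
                         rank, (int)line_len, text);
        if (n < 0 || (size_t)n >= out_cap - used)
            return -1;          /* output was truncated */

        used += (size_t)n;
        text += line_len;
    }
    if (used == 0 && out_cap > 0)
        out[0] = '\0';          /* empty input gives empty output */
    return (int)used;
}
```

For rank 3 and the input "hello\nworld\n", the output buffer holds
"[3] hello\n[3] world\n".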
===============================================================================
Changes in 1.3.1
===============================================================================
# OVERALL: MPICH2 is now fully compliant with the CIFTS FTB standard
MPI events (based on the draft standard).
# OVERALL: Major improvements to RMA performance for long lists of
RMA operations.
# OVERALL: Performance improvements for MPI_Group_translate_ranks.
# COLLECTIVES: Collective algorithm selection thresholds can now be controlled
at runtime via environment variables.
# ROMIO: PVFS error codes are now mapped to MPI error codes.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available using:
svn log -r7350:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.3.1
... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.3.1?action=follow_copy&rev=HEAD&stop_rev=7350&mode=follow_copy
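The COLLECTIVES entry above says that algorithm selection thresholds
can be set through environment variables, but it does not name them.
The sketch below shows the general pattern of reading such a tunable
with a fallback to a built-in default. It is not MPICH code. The
functions parse_threshold and threshold_from_env are hypothetical, and
any variable name passed to them is a placeholder, not a real MPICH
setting.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse a non-negative decimal threshold. Falls back to default_value
 * if text is NULL, empty, malformed, out of range, or negative. */
long parse_threshold(const char *text, long default_value)
{
    if (text == NULL || *text == '\0')
        return default_value;

    errno = 0;
    char *end = NULL;
    long value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < 0)
        return default_value;
    return value;
}

/* Read a threshold from the environment variable var_name (a
 * placeholder name in this sketch), using default_value when the
 * variable is unset or unusable. */
long threshold_from_env(const char *var_name, long default_value)
{
    assert(var_name != NULL);
    return parse_threshold(getenv(var_name), default_value);
}
```

Falling back to the default on malformed input keeps a mistyped setting
from silently selecting an unintended algorithm.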