Releases: geopm/geopm
Version 3.1.0
- Fri May 17 2024 Christopher M Cantalupo [email protected] v3.1.0
- Official v3.1.0 release tag
- ABI bump moving so-version from 2.0.0 -> 2.1.0 with backward compatibility for release v3.0
- Support for building on non-x86 CPU architectures
- Support for CPU frequency metrics and controls through standard Linux cpufreq sysfs interfaces
- Support GPU features through standard Linux DRM sysfs interfaces
- Support for LevelZero RAS signals #3155
- Update packaging to comply with standards
- Support for Rocky Linux packaging
- Implement versioning solution for python packages that works with python v3.6 - v3.11
- Setting the GEOPM_PROGRAM_FILTER environment variable is now a requirement for libgeopm to register a process for profiling
- Clarify copyright documentation
- Improve and publish OpenSSF scorecard
- Documentation and web page improvements
- Release distinct packages for documentation
- Improved error messaging
- Update IOGroup and Agent tutorial
- Remove dated runtime tutorial
- Reorganize source code repository directory structure
- Improve github CI automation
- Run Coverity static analysis as part of CI workflow
- Add package.sh script to build all of the repository packages
- Remove all use of autotools in python build and packaging
- Update integration tests to run on a wider range of systems
- Allow push_signal/control() for previous requests after read/write_batch()
- Use libz crc32 implementation to replace direct call to intrinsic
- Add performance test for the GEOPM Service
- Add upstream openmp.m4 macro from fossies
- Fix issues by deleting topology cache file when geopmd starts up
- Fix issues with installed headers: removing unwanted dependencies and specifying public symbol visibility
- Fix issue when --disable-systemd configure option is provided #3289
- Fix issue with SaveControl class in cases where controls are pruned from support at runtime
- Fix issues running geopmctl as root #3352 (not a regression from 3.0.1)
- Fix SysfsIOGroup batch write issue #3388 (not a regression from 3.0.1)
- Fix static analysis issues (not a regression from 3.0.1)
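Because GEOPM_PROGRAM_FILTER must now be set before libgeopm will register a process for profiling, a launch script may want to build the environment explicitly. The following is a minimal sketch, not part of GEOPM: the helper name `profiling_env` is hypothetical, and the comma-separated value format is an assumption that should be checked against the GEOPM documentation.

```python
import os


def profiling_env(program_names, base_env=None):
    """Return a copy of base_env with GEOPM_PROGRAM_FILTER set.

    Hypothetical helper. The comma-separated format of the value is an
    assumption, not something stated in these release notes.
    """
    if not program_names:
        raise ValueError("at least one program name is required")
    # Copy so the caller's environment mapping is never modified.
    env = dict(os.environ if base_env is None else base_env)
    env["GEOPM_PROGRAM_FILTER"] = ",".join(program_names)
    return env


env = profiling_env(["my_app", "my_helper"], base_env={"PATH": "/usr/bin"})
print(env["GEOPM_PROGRAM_FILTER"])
```

The returned mapping could then be passed as the `env` argument of `subprocess.run()` when starting the application.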
Version 3.0.1
- Wed Dec 06 2023 Christopher M Cantalupo [email protected] v3.0.1
- Hotfix for v3.0.0 release.
- Fix missing systemd dependency on the msr-safe systemd service. This bug could cause MSRs to be unavailable from the GEOPM Service if load order is incorrect.
- Fix systemd unit definition to maintain same model for GPUs/chip topology when linked against versions of libze_loader.so where "COMPOSITE" is not the default.
- Fix security issue where UID 0 was being used to indicate privilege; libcap is now used for capability checks instead.
- Fix bug in startup that was causing long delays when initializing batch interface of PlatformIO
- Fix potential lock when creating PlatformTopo object as user with CAP_SYS_ADMIN.
- Fix several build and packaging issues that could cause problems when dependency packages are not installed to standard locations.
- Fix "make coverage" build target dependency
- Fix issue with sphinx documentation generation
- Fix regression in support for client Intel platforms.
- Fix install failures on some SLES systems by modifying helper install script to prefer the zypper command to the rpm command.
- Add documentation for non-MPI application integration test for GEOPM Runtime.
Version 3.0.0
- Wed Oct 25 2023 Christopher M Cantalupo [email protected] v3.0.0
- Official v3.0.0 release tag.
- GEOPM Runtime support for non-MPI applications.
- Integration with OpenPBS through plugins and launcher support.
- Security improvements and bug fixes.
- Additional GEOPM Service DBus APIs to support application profiling.
- Communication between controller and application is managed by GEOPM Service.
- Creation of topo-cache and responsibility for determining system topology is managed by GEOPM Service.
- Update C++ standard requirement to C++17.
- Add more signals and controls including GPU and platform features.
- ConstConfigIOGroup uses JSON file to define constant settings/configurations as signals.
- Increase the sample period of the monitor agent from 5 ms to 200 ms to reduce default CPU requirements of runtime.
- Add Sapphire Rapids server (SPR) as a supported platform.
- Removal of libgeopmpolicy.so, use libgeopm.so instead.
- Removal of geopmdpy.runtime module: no support for python based agents.
- GEOPM_PERIOD / --geopm-period sets the sample period for controller in units of seconds.
- GEOPM_INIT_CONTROL / --geopm-init-control to write a batch of controls at application startup.
- GEOPM_CTL_LOCAL / --geopm-ctl-local disable controller's use of MPI.
- GEOPM_PROGRAM_FILTER / --geopm-program-filter to select processes for profiling.
- GEOPM_NUM_PROC sets number of processes per node for controller process to track.
- geopmlaunch support for PALS.
- geopmlaunch --geopm-preload option required for ld preloading libgeopm.so, not on by default.
- Default for --geopm-ctl is now "application".
- geopmlaunch does not control application CPU affinity by default (--geopm-affinity-enable now required).
- Debian / Ubuntu packaging support.
- Renamed runtime packages for all distros.
- Improvements for NVML and LevelZero support for GPUs.
- Documentation improvements including "Quick Start Guide"
- Improved error and warning messages.
- ABI so-version for libgeopm and libgeopmd increased to 2.0.0.
- Added --direct option for geopmaccess.
- Add GPU-CA agent for beta testing.
- Add FFNet agent for beta testing.
- Add CPU-CA agent for beta testing.
- FrequencyMapAgent can now control GPU frequency.
- Configuration and plugin directories for GEOPM renamed and combined.
- Add PBS integration for power capping clusters.
- Fuzz test integration and support for sanitizer builds.
- The environment of controller determines output file paths, not the application environment.
- Support for liburing for batching kernel I/O.
- Python interface for endpoint in beta.
- Program name is no longer the default profile name; "default" is used instead.
- Track time spent in MPI_Init*() by the application.
- Removed nearly all use of the /tmp directory (topo-cache still created in /tmp if GEOPM Service is not running)
- More detailed and accurate reporting of GEOPM overhead, MPI overhead, and controller startup time.
- Generic runner for GEOPM experiment infrastructure.
- MSR, NVML and LevelZero IOGroups not loaded except when user has CAP_SYS_ADMIN or through the GEOPM Service.
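The v3.0.0 notes above state that the monitor agent sample period moved from 5 ms to 200 ms and that GEOPM_PERIOD is given in seconds. The arithmetic behind the reduced CPU requirement can be sketched as follows; the function names are illustrative only and are not part of any GEOPM interface.

```python
def samples_per_second(period_ms):
    """Number of controller samples taken per second for a given period."""
    if period_ms <= 0:
        raise ValueError("period must be positive")
    return 1000 / period_ms


def period_ms_to_seconds(period_ms):
    """Convert a period in milliseconds to seconds, the unit GEOPM_PERIOD uses."""
    return period_ms / 1000


old_rate = samples_per_second(5)    # previous monitor agent default
new_rate = samples_per_second(200)  # default as of v3.0.0
print(old_rate, new_rate, old_rate / new_rate)
print(period_ms_to_seconds(200))
```

Under these numbers the controller wakes 40 times less often per second than before, which is the source of the lower default CPU cost.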
Version 2.0.2
- Wed Mar 29 2023 Brad Geltz [email protected] v2.0.2
- Hot fix 2 for release 2.0.
- Add security.md doc for vulnerability reporting.
- Align behavior of secure_make_dirs() to documentation w.r.t. intermediate directories.
- Includes bug fixes and documentation improvements.
- Fix constness of return value from dcgm_device_pool().
- Fix warning from recent gcc about uninitialized variables.
- Use PALSLauncher on australis.
- PALSLauncher: use list option to cpu-bind
- Fix for suppressed error reporting.
- Fix for SST kernel driver on SLES 15.3.
- Fix for issue where missing data can cause Controller crash.
- Update copyright year to 2023.
- Fix LevelZero exception location.
- Fix error when GPUs are supported by service but not client.
- Swap load order of msr and service iogroups.
- Resolve service integration test issues.
Version 2.0.1
- Wed Jan 25 2023 Christopher M Cantalupo [email protected] v2.0.1
- Hot fix 1 for release 2.0.
- Includes bug fixes and documentation improvements.
- Fix install and packaging of plugin directory (#2823).
- Fixes for IMPI mpiexec launch wrapper (#2822, #2820)
- Fix issues discovered with recent Clang and in the Ubuntu 22 environment (#2829, #2740)
- Better error reporting from geopmd signal handler (#2789).
- Fix for supporting LevelZero when MPI also initializes LevelZero (#2802).
- Better error reporting when application handshake fails (#2801).
- Use multi-user.target in systemd unit file rather than default.
- Fix overwrite of access list with --force option (#2712).
- Use control access list to generate signal list (#2707).
- Fix spelling errors in documentation (#2644).
- Support for recent LevelZero implementations which require user to zero call by reference parameters.
- Better error reporting with LevelZero topology failures.
- Update spec file to make LevelZero inclusion parameterized and suggestions from SUSE maintainers.
- Enable CNLIOGroup by default.
- Fix potential memory issue with CircularBuffer (not exposed by current implementation).
- Use more robust method to obtain sticker frequency.
- Use SKX MSR definitions for newer architectures.
Version 2.0.0
- Wed Aug 24 2022 Christopher M Cantalupo [email protected] v2.0.0
- Official v2.0.0 release tag.
- Provides the GEOPM Systemd Service.
- Removes Python 2 support, only supporting Python 3.
- Support for GPUs from Intel and NVIDIA.
- Support for the isst_interface driver.
- Support for new server processors including Sky Lake, Cascade Lake and Ice Lake.
- Support for Cray Linux energy counters.
- Higher performance / lower latency profile interface.
- More consistent naming scheme for PlatformIO signals and controls.
- Extended set of signals and controls provided by PlatformIO.
- Removed msr-safe requirement through GEOPM Service features.
- Support for new HPC runtime launchers (pals, impi).
- Flexible YAML report generation and parsing that may contain arbitrary content.
- Extended python interface support including Reporter features.
- Python based agents for prototyping runtime algorithms that do not require application feedback.
- Removed Energy Efficient Agent (will be replaced in a future release).
- Documentation and web page improvements.
- Other improvements and feature additions.
GEOPM 2.0.0 Release Candidate 3
- Tue Aug 16 2022 Christopher M Cantalupo [email protected] v2.0.0+rc3
- Release candidate 3 for version 2.0
- This is a pre-release version of GEOPM that has all features that will be present in the v2.0.0 release.
- No changes other than documentation and possible bug fixes are expected prior to v2.0.0.
- This represents a code freeze and version 2.0 is anticipated soon after this release.
- All feedback about this release candidate is appreciated: https://geopm.github.io/contrib.html
GEOPM 2.0.0 Release Candidate 2
- Fri Jul 1 2022 Christopher M Cantalupo [email protected] v2.0.0+rc2
- Release candidate 2 for version 2.0
- This is a pre-release version of GEOPM that has all features that will be present in the v2.0.0 release.
- The names of signals and controls provided by the PIO interface have changed for rc2 as described here: #1671
- Chapter 7 man page documentation has been added for the PlatformIO interface and supported signals and controls.
- Other changes required for version 2.0 have also been made.
- All feedback about this release candidate is appreciated: https://geopm.github.io/contrib.html
GEOPM 2.0.0 Release Candidate 1
- Fri May 27 2022 Christopher M Cantalupo [email protected] v2.0.0+rc1
- Release candidate 1 for version 2.0
- This is a pre-release version of GEOPM that has all features that will be present in the v2.0.0 release.
- Ongoing work for the v2.0.0 release is described in these issues: https://github.com/geopm/geopm/issues?q=is%3Aissue+is%3Aopen+label%3A2.0
- This is the first tagged version of GEOPM that provides the GEOPM Systemd Service: https://geopm.github.io/service.html
- Instructions on how to install the release candidate packages that provide the GEOPM Service are here: https://geopm.github.io/install.html
- The names of signals and controls provided by the PIO interface are expected to change prior to the next release to conform with the requirements described here: #1671
- All feedback about this release candidate is appreciated: https://geopm.github.io/contrib.html
- GEOPM Service RC packages available here: https://software.opensuse.org/download.html?project=home%3Ageopm%3Arelease-v2.0-candidate&package=geopm-service
GEOPM 1.1.0
- Tue Nov 5 2019 Diana Guttman [email protected] v1.1.0
- Release overview:
- Support for Python 3.6 has been added.
- Support for Python 2.7 continues but will be removed in a future release.
- New features targeting integration with resource managers.
- Enhancements to EnergyEfficientAgent.
- Improved support for automatic OpenMP region detection.
- Support for launching with OpenMPI.
- Bug fixes, new and updated tests, and updates to documentation.
- New features:
- GEOPM environment variables can now be initialized from a JSON file.
- Add geopm_agent_enforce_policy() function and Agent::enforce_policy() to public interface.
- Add tracing for the profile table log with GEOPM_TRACE_PROFILE.
- Add REGION_COUNT signal to get the number of times a region has been seen.
- Add REGION_COUNT signal to default trace columns.
- Add python wrappers for geopm_pio_c, geopm_topo_c, geopm_error_c, and geopm_agent_c interfaces.
- Add format_function() method to IOGroups to get a formatting function from a signal name.
- Add IOGroup for Compute Node Linux PM counters.
- Allow the FrequencyMapAgent frequency map to come from the agent's policy rather than the deprecated environment variable.
- Add launcher for OpenMPI.
- New beta features:
- Add geopmconvertreport script to convert report file into yaml and json.
- Add a new error type for data store errors.
- Add PolicyStore class to map agents and profiles to policies.
- Introduce new Endpoint API, which replaces and extends the ManagerIO.
- Implement geopm_endpoint_c API.
- Modified implementations and interfaces:
- Add CSV class to support CSV files created by GEOPM.
- Modify Tracer and ProfileTracer to use the CSV class.
- Add trace_formats() method to Agents.
- Change freq_sweep analysis to use system max frequency for default max.
- Move geopmpy package to 'production' status.
- Minimize set of functions in Environment C interface.
- Change Environment class variable names for better readability.
- Update FrequencyMapAgent to use Environment class for its environment variable.
- Add TEMPERATURE_* signals to list shown by geopmread.
- Change REGION_RUNTIME signal to reflect time of outer region only.
- Add MSR turbo ratio limit for KNL.
- Use max turbo ratio limit for platform max frequency.
- Remove ability to write turbo ratio limit.
- Add MPI_Barrier before entering all2all model region.
- Increase problem size of FFT to D class.
- Add IMPI support to tutorials.
- Add feature to geopmagent and Agent interface where partial policies will be completed with NANs.
- Add SLURM -bootstrap option for IMPI.
- Add geopm_time_to_string() to convert a time structure into a string.
- Add write_file() helper function.
- Add value of policy to report, or DYNAMIC when policy comes from an Endpoint.
- Separate Agent creation time from init() in Controller.
- Add DebugIOGroup for extending trace with internal Agent values.
- Add pthread mutex to beginning of SharedMemory regions, with get_scoped_lock() as the only method to lock the mutex.
- Remove pthread mutex from ManagerIO struct.
- Use git ls-tree to generate the MANIFEST in any git repo.
- Remove m_request_s from PlatformIO public interface.
- Change RPM to build libgeopmpolicy only and remove check step.
- Add get_hostnames() method to Controller.
- Add unlink() method to SharedMemory.
- Update VERSION with each call to autogen.sh.
- Do not markup anything in geopmbench if all regions are suffixed with '-unmarked'.
- Update OMPT interface to newest standard.
- Use libdl and libelf to map instruction address to symbol name.
- Remove hard requirement for hosts file usage in tutorials.
- Remove MacOS portability.
- Remove signal handling logic from Controller.
- Change board power min/max/tdp to use sum aggregation.
- Change power cap policy of PowerGovernorAgent and PowerBalancerAgent to POWER_PACKAGE_LIMIT_TOTAL.
- Change "mpi-time" in report to "network-time" and change time to include all network time.
- Rename EPOCH_RUNTIME_MPI signal to EPOCH_RUNTIME_NETWORK.
- Move Environment class definition to header.
- Split geopm_pmpi.c into C/C++ parts.
- Clean up build and run scripts for tutorials.
- Remove region entry and exit lines from the trace by default; they can be added with --enable-bloat.
- Improved error messages and warnings:
- Make prefix of runtime warning strings consistently start with "Warning: ".
- Improve error message when msr driver can't be loaded.
- Print a proper message on failure to launch lscpu job.
- Add more verbose geopm plugin load failure warning.
- Add more detailed description to geopm_error_message() based on last exception thrown.
- Change throw to warning for PowerBalancerAgent running on a single node.
- Fix error message when MSR read fails.
- Extensive changes to EnergyEfficientAgent algorithm:
- Change EE Agent to learn separately for each control domain.
- Add max filtering to EnergyEfficientRegion.
- Use sticker when passing NaN in the policy.
- Add PERF_MARGIN as a policy for EnergyEfficientAgent.
- Do not set frequency for regions shorter than 50 ms or unmarked.
- Have EE Agent always use min frequency for network regions.
- Update EE agent to use region count to detect adjacent regions with same hash.
- Add separate max frequency to use for static policy.
- Bug fixes and refactoring in EnergyEfficientAgent.
- Updates to integration tests:
- Increase iterations for EnergyEfficientAgent test.
- Decrease margin in test for geopm python wrapper measuring time.
- Add an integration test checking that chosen frequencies increase monotonically with CPU-bound time in regions.
- Update integration tests to use new trace file format.
- Add imbalance to power_balancer integration test.
- Refactor report mock functions in integration tests.
- Move integration test helpers into util.py.
- Add integration test for the epoch data in report.
- Add msr save and restore calls to test launcher.
- Updates to unit tests:
- Add unit tests for EnergyEfficientAgent.
- Cleanup environment variables in unit tests.
- Add unit tests for the geopmpy.io module.
- Add unit tests for the geopmpy.launcher module.
- Make profile tests work with different task sets.
- Fix TestAffinity to check for OMP_NUM_THREADS in test setup.
- Fix ExceptionTest to account for extra char in error message.
- Updates to documentation:
- Add Daniel Wilson to the AUTHORS file.
- Change CONTRIBUTING instructions on how to get version.
- Add version to geopm man pages.
- Update man pages and README to describe Environment changes and integration with resource managers.
- Fix PlatformTopo C++ man page to match new interfaces.
- Add section to README about user environment for non-standard install.
- Modify frequency_map man page to use floating point frequencies.
- Rename geopm_pio_c man page to show its section number.
- Add man page for Endpoint class.
- Update endpoint_c man page.
- Remove references to uninstalled man pages from geopm.7.
- Remove specific list of available launchers from geopm.7.
- Add documentation to README for Ubuntu support.
- Add example for systems programmers using PlatformIO.
- Fix typos in documentation.
- Bug fixes:
- Fix paths for building tutorial from module environment.
- Fix Tracer handling of # signals from environment.
- Fix Tracer handling of region hash and hint integers.
- Fix a bug where regions with the same name as the profile did not appear in the report.
- Fix trace file cache loading print in io.py.
- Rename and fix analysis for EE and frequency map agents.
- Fix a bug where LD_PRELOAD was always set.
- Update geopmplotter to use agents and cosmetic fixes to plots.
- Fix geopm::string_split() so it works with multi-character delimiters.
- Fix build when using --disable-openmp.
- Fix build when using --disable-mpi.
- Fix a bug where launcher did not use srun reservation for geopmread cache.
- Fix placement of verbose flag for geopmbench.
- Fix epoch reporting when there are no regions.
- Fix generation of report hdf5 cache.
- Fix date generation in geopm_time.h.
- Only overwrite roff pages with ronn if the roff page is missing.
- Avoid a buffer overrun when copying cpusets.
- Check if MPI has been finalized before freeing the comm.
- Fix stderr piping in autogen.sh.
- Fix build errors from gcc8.
- Fixes to allow installed headers to be used out of source.
- Fix a bug where tutorial tarball was not built when docs are disabled.
- Remove DRAM power from PowerGovernorAgent samples.
- Avoid loss of precision when converting policies to json strings.
- Do not use GEOPM_REGION_HASH_INVALID in Agent implementations.
- Remove '0x' from IMPI affinity mask.
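The v1.1.0 notes above mention that geopmagent and the Agent interface complete partial policies with NaNs. The idea can be illustrated with a small sketch; it is not the GEOPM implementation (which is in C++), and the function name `complete_policy` and its parameters are hypothetical.

```python
import math


def complete_policy(partial, policy_size):
    """Pad a partial policy with NaN up to policy_size values.

    Illustrative only: a NaN entry stands for "no value provided", which
    the consumer of the policy can then replace with its own default.
    """
    if len(partial) > policy_size:
        raise ValueError("policy has more values than the agent accepts")
    return list(partial) + [math.nan] * (policy_size - len(partial))


policy = complete_policy([1.5e9], 3)
print(len(policy), policy[0], math.isnan(policy[1]), math.isnan(policy[2]))
```

Note that NaN never compares equal to itself, so padded entries should be detected with `math.isnan()` rather than `==`.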