Enhance cmake install instructions with std install location/destination #35
Closed: arvindcheru wants to merge 71 commits into ROCm:amd-mainline from arvindcheru:arvindcheru/sdk-amd-staging-dev-504870
* Kernel Serialization Documentation Added docs for kernel serialization. * Update counter_collection_services.md * correcting counter collection mode names * correcting counter collection modes naming --------- Co-authored-by: Gopesh Bhardwaj <[email protected]>
* Adding License for roctx, docs, tests packages * Fixing Docs/ROCTx packages * Fixing roctx path
… support (#1175) * include file and print formatters for OMPT support * Apply suggestions from code review * Remove rocprofiler_ompt_set_callbacks * Reorder ROCPROFILER_EXTERNAL_CORRELATION_REQUEST_OPENMP --------- Co-authored-by: Jonathan R. Madsen <[email protected]> Co-authored-by: Jonathan R. Madsen <[email protected]>
* Timing documentation Update Documentation update for timing differences. Needs additional review from Joe Greathouse before landing. * Update comparing-with-legacy-tools.rst
* Relax timestamp checking - Prevent recurring CI failures that have no remedy until HSA/driver issues are resolved * Replace "cc" abbreviation in tests with "counter-collection" * Update CODEOWNERS to explicitly include jrmadsen for source/include * Extra logging in rocprofiler tool library * Tweak aborted-app test - remove counter collection as part of the test
* Add rocprofv3-multi-node.md to source/lib/rocprofiler-sdk-tool * Initial source re-organization - create "output" static library * Update include/rocprofiler-sdk/cxx/serialization.hpp - add GPR count fields to kernel symbol serialization * Add source/scripts/generate-rocpd.py - reads one or more JSON output files from rocprofv3 and writes rocpd SQLite3 database - Note: preliminary implementation * More reorganization b/t lib/rocprofiler-sdk-tool and lib/output * Updates to generate-rocpd.py - add SQL views - option: --absolute-timestamps -> --normalize-timestamps - option: --generic-markers - misc fixes with regards to getting the views working - support marker names * Update generate-rocpd.py - Add --marker-mode option * Update generate-rocpd.py - Improve debugging of bad bulk SQLite statements * Update rocprofv3-multi-node.md - cleanup of proposed SQL schema * lib/output/format_path.{hpp,cpp} - rename format to format_path (in config.hpp and config.cpp) - move format_path functionality to format_path.{hpp,cpp} * Rework lib/output/tmp_file_buffer.{hpp,cpp} * Update output_key.cpp - support %cwd%, %launch_date% * Rework lib/output/buffered_output.hpp * Support csv_output_file constructed via domain_type * Update lib/output/domain_type.{hpp,cpp} - get_domain_trace_file_name - get_domain_stats_file_name * Update lib/rocprofiler-sdk-tool/tool.cpp - tweak headers * Update lib/output/generate*.cpp - remove include of helpers.hpp - CSV uses domain_type for filenames * Update samples/counter_collection/per_dev_serialization.cpp - make wait_on volatile * Remove tool_table from lib/output and lib/rocprofiler-sdk-tool - Also split various structs into their own files - lib/output/agent_info - lib/output/metadata - lib/output/kernel_symbol_info - lib/output/counter_info - Implemented rocprofiler::tool::metadata * Optimize rocprofiler_tool_counter_collection_record_t - reduce the size of the struct from 24784 bytes to 8376 bytes * Introduced output_config - split subset of 
config (from tools library) into output_config to be able to configure the output generating functions separately from the tool library - this is a significant step towards the output generating functions not relying on static global memory * Stream chunks of data into output instead of loading all info memory * Remove duplicate group_segment_size in rocprofiler_kernel_dispatch_info_t serialization * Adding Q&A to rocprofv3-multi-node.md * Remove all remaining include lib/rocprofiler-sdk-tool from lib/output - migrated a fair amount of code from lib/rocprofiler-sdk-tool/helper.hpp to lib/output * Update Q&A of rocprofv3-multi-node.md * Fix minor compilation errors + minor cleanup * Update hsa/async_copy.cpp - when ROCPROFILER_CI_STRICT_TIMESTAMPS > 0, reduce the active_signal sync wait time * Update profiling_time.hpp - fix log messages for when start/end time is less/greater than enqueue/current CPU time * Fix generate_stats for tool_counter_record_t * Dictionary optimization for generate-rocpd.py --------- Co-authored-by: SrirakshaNag <[email protected]>
Co-authored-by: Benjamin Welton <[email protected]>
* SWDEV-495725: Skipping metadata init for unsupported agents * Update source/lib/output/metadata.cpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> --------- Co-authored-by: mclin <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* add support for select function in derived counters * formatting * renaming select dims variable name from set to map * format * Update doc with select() for dimensions * use : for defining range of values in select dims * - update dimension for metric after select. - make sure to raise runtime error if user provides range for a dimension. * use map instead of unordered_map for select dim info * new line EOF * fix bug: select() operator. * Update evaluate_ast.cpp format * added a check for dim value exceeds max. * Update source/lib/rocprofiler-sdk/counters/evaluate_ast.cpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update source/lib/rocprofiler-sdk/counters/evaluate_ast.cpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * updated doc with data example for select operation. * changelog.md * Update CHANGELOG.md --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* cache reference nodes * evaluation based on dim args * format * add dimensions for reduce operator * add dimensions for reduce operator * add dimensions for reduce operator docs * add dimensions for reduce operator. * refactor switch cases * Update CHANGELOG.md * updated doc with data example * updated doc with data example for reduce operation. * added fallthrough in switch case sum. * changelog.md * format * fix bug in constuct_test_data()
* Squashed commit of the following: commit b76f2635f4b65599f03812a73d0cf410f5ada213 Author: Mythreya <[email protected]> Date: Fri Apr 26 00:29:09 2024 +0000 Changed for PR feedback commit bedb8ad566ff42fbf117b19202c26c507abcf8ac Author: Jonathan R. Madsen <[email protected]> Date: Thu Apr 25 19:20:06 2024 -0500 Fix installation commit a98f8a69459a1450a1be9c98e20b3c1e7f2568c2 Author: Jonathan R. Madsen <[email protected]> Date: Thu Apr 25 19:16:35 2024 -0500 Restructure the headers commit 46489a020ffafdd5f4ce3f580469ff233ef67fe1 Author: Mythreya <[email protected]> Date: Tue Apr 23 23:31:10 2024 +0000 Update hsa include commit 8e795282cce348fc6aa736b7857b21aeb32aa20a Author: Mythreya <[email protected]> Date: Tue Apr 23 23:02:32 2024 +0000 Report page migration events as start/end * Updated tests accordingly * Page migration events are reported independently commit 8784e5ad4895a626a2a8e4ac12f8021b34172bd4 Author: Mythreya <[email protected]> Date: Tue Apr 16 17:01:57 2024 +0000 Update handling of dropped page migration events Previously, we dropped all locally buffered events when we detect that KFD has dropped some events. This may drop too many pending events too eagerly. When we receive an end event and cannot find the corresponding start, we can be sure that KFD has dropped some events in the immediate past. When this happens, we look through all locally buffered events and report the start events that are older than 10s as partial events --- they have no "end" information (we expect that the end events have been dropped). We also set the polling timeout to 10s to prevent the local buffer from getting too large with events waiting to be paired up. 
Updated tests commit 2e8e0b07eeda9b5990e1ae8d28dcd3a035ce38e1 Author: Mythreya <[email protected]> Date: Tue Apr 16 17:01:31 2024 +0000 Docs for triggers * Fix page migration sample * Fix hasher, kfd install * Add hsa include * Install KFD include dir * Updates from code review - single timestamp field - node_id -> agent_id - from_node -> from_agent - to_node -> to_agent * Misc revisions * Remove page-migration install target * Update page-migration pytest * Tweak to serialization * Address PR comments * Update page-migration test * Add cli args, update iterations * Address PR comments * Add abi.cpp for static_asserts * Update page_migration gtest with only runtime tests * Moved helpers into utils.hpp --------- Co-authored-by: Jonathan R. Madsen <[email protected]>
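The dropped-event handling described in the squashed commit above (pair each end event with its start; on an unmatched end, report start events older than 10s as partial) can be sketched roughly as follows. This is a minimal illustration, not the rocprofiler-sdk implementation: `migration_event`, `pairing_buffer`, the `key` field, and the choice to discard the unmatched end event itself are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

namespace sketch
{
// Hypothetical record: an empty end_ns marks a partial event (end was dropped).
struct migration_event
{
    std::uint64_t                key;
    std::uint64_t                start_ns;
    std::optional<std::uint64_t> end_ns;
};

// 10 s threshold, taken from the commit message.
constexpr std::uint64_t stale_after_ns = 10'000'000'000ULL;

class pairing_buffer
{
public:
    // Buffer a start event until its end arrives.
    void on_start(std::uint64_t key, std::uint64_t ts) { m_pending[key] = ts; }

    // Returns the events that should be reported in response to this end event.
    std::vector<migration_event> on_end(std::uint64_t key, std::uint64_t ts)
    {
        std::vector<migration_event> out;
        auto                         it = m_pending.find(key);
        if(it != m_pending.end())
        {
            // Normal case: a complete start/end pair.
            out.push_back({key, it->second, ts});
            m_pending.erase(it);
            return out;
        }

        // Unmatched end: some events were dropped upstream. Report buffered
        // starts older than the threshold as partial events. The unmatched end
        // itself is discarded here (an assumption of this sketch).
        for(auto itr = m_pending.begin(); itr != m_pending.end();)
        {
            if(ts > itr->second && ts - itr->second > stale_after_ns)
            {
                out.push_back({itr->first, itr->second, std::nullopt});
                itr = m_pending.erase(itr);
            }
            else
            {
                ++itr;
            }
        }
        return out;
    }

    std::size_t pending() const { return m_pending.size(); }

private:
    std::unordered_map<std::uint64_t, std::uint64_t> m_pending;
};
}  // namespace sketch
```

Younger start events stay buffered so that a late end event can still complete them, which is the behavior the commit message contrasts with dropping every buffered event eagerly.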
* Fix navi3 kernel tracing - conditional aql::set_profiler_active_on_queue only when counter collection is registered * Update changelog * Update following name change
* Adding rccl-dev package in core CI testing * Update continuous_integration.yml
* Initial commit: Need to implement wrapper function to collect data and test that wrapper function is correctly replacing core HSA functions * Attempted to implement wrapper implementation for hsa memory allocation functions. Need to modify generate record files and test if implementation is working as expected * Debugging and implementing generateCSV function * Memory allocation size and starting address outputted to csv and json file formats * Formatting * Initial setup for OTF2 and Perfetto generation * Collecting agent id for memory_allocation and formatting * Modified memory_allocation.cpp to set up code for AMD_EXT commands * Support for memory_pool_allocate added * Removed accidentally added file * Made flag optional and added more OTF2 and Perfetto code. Needs testing to ensure perfetto and OTF2 works * Formatting * Fixed perfetto and otf2 output * Fixed flag issue due to incorrect buffer use * Updated documentation * Small cleaning and comments * Added test for HSA memory allocation tracing * Fixed summary test validation errors due to allocation tracing. Added type to location_base to create unique event ids for allocation due to OTF2 trace error * Decreased lower limit of hip calls for test * Modified summary tests to vary number of allocate requests * Minor fixes to address comments. Still need to address OTF2 comments * Fix docs and changed OTF2 to use enum for type specified in location_base construction * Fixed schema error * Added vmem command tracking. Need to add test * Updated test to work with vmem command and updated generateCSV to output int instead of hex string. * OTF2 enum update and misspelling fix * CI does not support Virtual Memory API. Removed vmem test. 
Will add back if CI is modified to support vmem API * Update CMakeLists.txt for memory allocation test * Updated summary test * Minor fixes to address comments * Moved domain_type.hpp enum to before LAST * Fixed compile errors and formatting * Fixed stats summary domain name error * Added rocprofv3 test * Page migration test fix * Undo page migration test changes. Failures do not appear to have to do with memory allocation
* Runtime initialization tracing - callbacks and buffer entries notifying when a runtime has been initialized * Minor cleanup to registration.cpp * JSON tool implementation * Increase perfetto_reader timeout * Handle perfetto_reader timeout when attr doesn't exist * clang-tidy fixes to memory_allocation.cpp
* Format rocprofv3 help * python formatter fix
* Host trap PC sampling uses new record type * removing redundant field * formatting * simplifying templates in the parser - no need for HostTrap boolean * reviving some parser tests * hw_id decoding on GFX9 * HW id parser test * parser CID test * Parser multigpu test * removing rocprofiler_pc_sampling_record_t and some fields from hw_id * simplifying parser context * keep bench test internally * initializing gfx9_hw_id_t differently * anonymous struct first * avoiding inlining initialization of struct
… (#1208) * Rebased optimizations for rocprofv3 tool * Fixing merge conflicts * Formatting * Open from within mutex * Small name changes * Added operator
* correcting usage example * rccl trace * Adding Navi power state limitation * Addressed feedback * kernel-rename * kokkos trace * more information on Kokkos tracing * Correcting tool library hardcoding * summary domains * Updating domain stats file * updating images * rocprofv3 default behavior update * Removing README from API documentation * Added missing description in Topics * Fixed wrong rendering of README in API document * Fixing Topics in API docs * Removing API doc for details/rccl.h * Addressed review comments
Adds rocprofiler_load_counter_definition. This function allows a counter definition file to be supplied to rocprofiler-sdk directly. Takes in a string containing the counter definition YAML, its size (in bytes), and a flag value to state whether this is an append operation or not. --------- Co-authored-by: Benjamin Welton <[email protected]> Co-authored-by: Jonathan R. Madsen <[email protected]> Co-authored-by: usrihari123 <[email protected]>
Update continuous_integration.yml Update continuous_integration.yml Adding EMU Runners Update continuous_integration.yml Update continuous_integration.yml Bump thollander/actions-comment-pull-request from 2.5.0 to 3.0.1 Bumps [thollander/actions-comment-pull-request](https://github.com/thollander/actions-comment-pull-request) from 2.5.0 to 3.0.1. - [Release notes](https://github.com/thollander/actions-comment-pull-request/releases) - [Commits](thollander/actions-comment-pull-request@v2.5.0...v3.0.1) --- updated-dependencies: - dependency-name: thollander/actions-comment-pull-request dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Update continuous_integration.yml Update continuous_integration.yml Update run-ci.py Update upload-image-to-github.py Update continuous_integration.yml Update continuous_integration.yml Update continuous_integration.yml Update continuous_integration.yml Update continuous_integration.yml using github output Update continuous_integration.yml Revert temp change Update continuous_integration.yml Update continuous_integration.yml
# Create PSDB.yml enabling psdb for github emu staging branch ## What type of PR is this? (check all applicable) - [ ] Refactor - [x] Feature - [ ] Bug Fix - [ ] Optimization - [ ] Documentation Update ## Technical details Moving internal repo from github to github EMU ## Added/updated tests? _We encourage you to keep the code coverage percentage at 80% and above._ - [ ] Yes - [x] No, Does not apply to this PR. ## Updated CHANGELOG? _Needed for Release updates for a ROCm release._ - [ ] Yes - [x] No, Does not apply to this PR. ## Added/Updated documentation? - [ ] Yes - [x] No, Does not apply to this PR. --------- Co-authored-by: Mallya, Ameya Keshava <[email protected]>
Fixed destination for mirror
* Rebased optimizations for rocprofv3 tool * Fixing merge conflicts * Formatting * Open from within mutex * Small name changes * Added operator * removed some parameters * Optimizing counter collection * Re-arrange code * Adding back dimension query * Formatting * Update source/lib/rocprofiler-sdk/thread_trace/att_core.cpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Formatting 2 * Fix for test compilation * Fix for yield * Adding back check for zero * Improved thread handling * Formatting * Remove automatic start * Adding test * Small fixes * Adding lock for buffer callbacks * Fix for race condition in AST * Adding check for ptr --------- Co-authored-by: Giovanni Baraldi <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
updating roctx documentation for functions
Reducing the parser's workload in hammer test. Reducing hammer test workload by 4 to prevent timeout on ThreadSanitizer job.
… Alloc_Flags. (ROCm#12) * rename csv output header for scratch memory trace from Alloc_flags to Alloc_Flags. * csv output tests for scratch memory trace. * Check output lengths --------- Co-authored-by: Mythreya <[email protected]>
* Update kfd ioctl header - Adds new event for dropped events - Mirrors kernel update by Philip Yang * Add error code for page migration events - Adds support for new error code field for page migration end events - Page migration end event is now generated for migration failure - Error code is zero for successful migration * Add dropped event SMI event - New event type indicates if events were dropped - Events are dropped if the buffer is full
* Fix page-migration background thread on fork After falling off main in the forked child, all the children try to join on the parent's monitoring thread. This results in a deadlock. Parent is waiting for the child to exit, but the child is trying to join the parent's thread which is signaled from the parent's static destructors. Even with just one parent and child, due to copy-on-write semantics, a child signalling the background thread to join will still block (thread's updated state is not visible in the child). This fix creates background threads on fork per-child with a pthread_atfork handler, ensuring that each child has its own monitoring thread. * Formatting fixes * Detach page-migration background thread and update test timeout * Attach files with ctest * Update corr-id assert * Tweak on-fork, simplify background thread * Revert thread detach
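The per-child monitoring thread described in the commit above can be illustrated with a small POSIX sketch. It is not the rocprofiler-sdk code: every name in the `sketch` namespace is hypothetical, and it assumes a platform (such as Linux with glibc) where starting a thread from the child handler of `pthread_atfork` works in practice, even though POSIX only guarantees async-signal-safe calls in the child of a multithreaded process.

```cpp
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

namespace sketch
{
// Hypothetical monitor state; none of these names come from rocprofiler-sdk.
std::atomic<bool>  stop_requested{false};
std::atomic<pid_t> monitor_owner{0};   // pid of the process that started the thread
std::thread*       monitor = nullptr;  // heap-allocated so a child can abandon it

void monitor_loop()
{
    while(!stop_requested.load())
        std::this_thread::sleep_for(std::chrono::milliseconds{1});
}

void start_monitor()
{
    stop_requested.store(false);
    monitor_owner.store(getpid());
    monitor = new std::thread{monitor_loop};
}

void stop_monitor()
{
    // Only join a thread that was created in this process.
    if(monitor == nullptr || monitor_owner.load() != getpid()) return;
    stop_requested.store(true);
    monitor->join();
    delete monitor;
    monitor = nullptr;
}

// Runs in the child right after fork(). The parent's thread does not exist in
// the child, so the inherited handle is dropped without joining (a deliberate
// leak) and a fresh thread owned by the child is started.
void child_after_fork()
{
    monitor = nullptr;
    start_monitor();
}

// Register the child handler; intended to be called once per process.
void install_fork_handler() { pthread_atfork(nullptr, nullptr, child_after_fork); }

// Forks once. The child checks that it owns its monitor, stops it, and exits
// with 0 on success; the parent returns the child's exit code.
int fork_and_check()
{
    pid_t pid = fork();
    if(pid == 0)
    {
        bool ok = (monitor_owner.load() == getpid());
        stop_monitor();
        _exit(ok ? 0 : 1);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
}  // namespace sketch
```

A caller would run `install_fork_handler()` and `start_monitor()` once at startup; after that, each forked child joins only its own thread in `stop_monitor()`, which avoids the join-on-parent deadlock the commit message describes.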
…ROCm#9) * Adding Trace Period feature to rocprofv3 * Adding feature documentation * Update source/bin/rocprofv3.py Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Fixing format * Moving to Collection Period and changing the input params * Format Fixes * Fixing rebasing issues * Removing atomic include from the tool * Adding more options for units, optimizing the code * Fixing rocprofv3.py * Fixing time conv & adding time controlled app * Fixing format * Changing to shared memory testing methodology * use of shmem use * Fix include headers for transpose-time-controlled.cpp * Format upload-image-to-github.py * Removing shmem and using only env var to dump timestamps from the tool * Tool Fixes + Test Config * Adding Tests * Fixing Review comments * Update trace period implementation * Update trace period tests * check between start and stop timestamps * Merge Fix * Update validate.py * Improve safety of rocprofiler_stop_context after finalization * Pass context id to collection_period_cntrl by value * Adding 20 us error margin * Ensure log level for collection-period test is not more than warning --------- Co-authored-by: Ammar ELWazir <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <[email protected]>
* Ability to select alternative compiler per file Implementation of ompt interface to rocprofiler SDK. task_create and task_schedule are not supported. Misc updates Update OpenMP target sample - samples/ompt -> samples/openmp_target - fix sample test of openmp-target - reorganize files Rework OpenMP implementation Minor OpenMP implementation cleanup Rename samples/openmp_target CMake targets Add tests/bin/openmp - OpenMP target test app in tests/bin/openmp/target Format samples/openmp_target CMakeLists.txt Misc lib/rocprofiler-sdk/openmp cleanup - fix includes - convert_arg Update openmp.def.cpp - tweak includes - remove lots of temporary variables Update samples - common::get_callback_id_names() -> common::get_callback_tracing_names() - add kernel dispatch, memory copy, scratch memory buffered tracing to openmp target sample Fix code object operation names - add "CODE_OBJECT_" prefix Update include/rocprofiler-sdk/openmp/api_id.h - remove spurious comment Miscellaneous openmp updates - similar API for openmp_begin and openmp_end - move implementations of ompt callbacks to openmp.cpp - ompt_{thread_begin,thread_end,parallel_begin,parallel_end}_callbacks are openmp_events [SWDEV-484495] Fix int truncation in CSV output (#1098) CSV output truncates doubles to ints when it shouldn't. Derived metrics are (mostly) doubles and lose precision (or become worthless) if treated as an int. Converted these to double to match the format we return from rocprof-sdk. Co-authored-by: Benjamin Welton <[email protected]> Update limit for max counter records in rocprof-tool (#1073) A fixed sized std::array is used to store counter records in rocprofiler SDK. This limit was breached in SWDEV-484742. Upping the limit to 512 to be less likely to reach this limit again. 
adding proxy ompt_data_t * arguments fixes for proxy pointers - Implement proxy ompt_data_t* pointers for clients - Add ompt_data_t* arguments back to callback API - Modify openmp sample to illustrate use of proxy pointers formatting SWDEV-467350: Skipping tool counter iteration for unsupported hardware (#1083) Fixing some accumulate metrics (#1089) * Fixing some accumulate metrics * Fixing some more accumulate metrics --------- Co-authored-by: Benjamin Welton <[email protected]> updating rocprofv3 help options (#1113) * updating rocprofv3 help options * updating CHANGELOG Fixing installed pacakge tests in CI (#1119) * Fixing installed pacakge tests in CI * Formatted rocprofv3.py with black formatter SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests. (#1112) * SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests. * Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp Co-authored-by: Vladimir Indic <[email protected]> * Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp Co-authored-by: Vladimir Indic <[email protected]> * Adding backlog for codeobj changes * Formatting * Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp Co-authored-by: Vladimir Indic <[email protected]> * Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp Co-authored-by: Vladimir Indic <[email protected]> --------- Co-authored-by: Vladimir Indic <[email protected]> SWDEV-487621: Fixes for metric definitions (#1118) * Fixes for metric definitions * Removing gfx8 * Update changelog * Fixing unit tests * Small fixes * Fix for write size Fix PSDB change (#1120) Reverts change to `source/include/rocprofiler-sdk/callback_tracing.h` from commit 9b2ece7 clang-18 build fix for RCCL (#1123) Removes ambiguity on const usage, which clang-18 complains about (preventing build with warn error). 
mem copy direction field update (#1124) Adding Node-id for debugging with log level trace (#1090) fix botched rebase Per Jonathan to remove -rdynamic warning so CI will continue pedantic formatting Correct the package name of rocprofiler-sdk (#1126) * Correct the package name of rocprofiler-sdk ROCM VERSION(for ex: 60300) was missing in the package name. Added the same * Use cmake cache string while setting the variable for ROCm Version * correct the cmake-format --------- Co-authored-by: Ranjith Ramakrishnan <[email protected]> Fixing kokkosp tool library packaging (#1121) * Fixing kokkosp tool library packaging * Update source/lib/rocprofiler-sdk-tool/kokkosp/CMakeLists.txt Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update CMakeLists.txt * Update CMakeLists.txt * Component Requirement in CPack * Adding package dependency * Update CMakeLists.txt * Update rocprofiler_config_packaging.cmake * Fix rocprofiler-sdk-tool-kokkosp BUILD/INSTALL RPATH - CMAKE_INSTALL_LIBDIR doesn't help * Add BUILD/INSTALL RPATH to rocprofv3-trigger-list-metrics - fixes packaging issues * Update packaging - core depends on rocprofiler-sdk-roctx - add CPACK_DEBIAN_PACKAGE_SHLIBDEPS_PRIVATE_DIRS to resolve inter-package dependencies * Fix package depends version format * Improve tests/rocprofv3/summary/validate logging * Update CI workflow - prioritize roctx package in Install Packages step * Remove setting <package-name>_VERSION in config.cmake.in - this is automatically handled by existence of <package-name>-config-version.cmake * Update rocprofiler-sdk-config.cmake - relax find_package versioning requirements to same major and minor version * Update rocprofiler-sdk-config.cmake - relax find_package versioning requirements (remove EXACT, specify range) * Tweak CI workflow * Update perfetto_reader.py - better handle failure to load trace processor * Misc cleanup for config packaging * Update config packaging * Update config packaging * 
Revert perfetto for core-rpm packages * Revert perfetto for core-rpm packages - perfetto < 0.9.0 * Tweak tests/rocprofv3/summary/validate.py - reorder some checks --------- Co-authored-by: Ammar Elwazir <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <[email protected]> Clang Warning Fixes (#1131) Builds prevented on clang-18 Adding start and end timestamp columns in csv (#1128) * Adding start and end timestamp columns in csv * Adding assert check for the counter timestamps --------- Co-authored-by: Gopesh Bhardwaj <[email protected]> rocprofv3: docs and help menu updates (#1129) * doc updates * Correcting ROCtx information * Making ROCTx string consistent * missing occurence Renamed agent profiling service to device counting service (#1132) * Renamed agent profiling service to device counting service Name more aptly represents what agent profiling did (device wide counter collection). Conversion of existing user code can be performed by the following find/sed command: find . -type f -exec sed -i 's/rocprofiler_agent_profile_callback_t/rocprofiler_device_counting_service_callback_t/g; s/rocprofiler_configure_agent_profile_counting_service/rocprofiler_configure_device_counting_service/g; s/agent_profile.h/device_counting_service.h/g; s/rocprofiler_sample_agent_profile_counting_service/rocprofiler_sample_device_counting_service/g' {} + * Converted dispatch profile to dispatch counting service * Debug for functioal counters test * Minor changes for CI * Minor fix * More fixes for CI * Update evaluate_ast.cpp --------- Co-authored-by: Benjamin Welton <[email protected]> Testing updated RPM dockers (#1136) * Testing updated RPM dockers * Trying to fix PSDB for test package dependency Agent Profiling Fixes for Broken/Improper API Usage (#1122) Prevent's multiple setups of agent profiling on the same agent. Fixes agent read context to only read agents that were setup. 
Prevent copy of agent profiling internal data struct and reset hsa_signal on move to prevent inadvertant delete. Simplifying PR template (#1139) Implementation of ompt interface to rocprofiler SDK. task_create and task_schedule are not supported. Fixing installed pacakge tests in CI (#1119) * Fixing installed pacakge tests in CI * Formatted rocprofv3.py with black formatter Fix PSDB change (#1120) Reverts change to `source/include/rocprofiler-sdk/callback_tracing.h` from commit 9b2ece7 delete unused files added arguments to some OMPT buffter records * Fix cmake issues Remove rocprofiler_ompt_finalize_tool - a public API function is not necessary: should just finalize rocprofiler-sdk Fix duplicate ROCPROFILER_{BUFFER,CALLBACK}_TRACING_KIND_STRING Add lib/rocprofiler-sdk/ompt.hpp - declares rocprofiler::sdk::finalize_ompt Remove change to tests/rocprofv3/summary/conftest.py Add set_fini_status(1) back to registration.cpp Deleted uneeded files Incoporate OpenMP code and sample Fix merge issues with amd-staging Add push_correlation_id for OpenMP tasking; improve debugability fixup bad merge * Suppress OpenMP data race * Fix openmp_target sample * Enum and struct name changes + source code reorg - remove mix of ompt and openmp - opted for ompt - changes made for consistency - ompt_api -> ompt - openmp_api -> ompt - OPENMP -> OMPT * Update tests and more renaming - dest_device_num -> dst_device_num - src_addr -> src_address - dest_addr -> dst_address - remove info_type::begin - require OMP_TARGET_OFFLOAD * Update openmp-target test/sample env and labels * Formatting * Tweaks to cmake for openmp target - Disable for thread sanitizers due to preloading issue * OpenMP target cmake updates - remove gfx1010 (fails on mi300) - OPENMP_GPU_TARGETS * Remove device_unload and target_map_emi support - these are never supported by AMD OpenMP compilers * Update CI workflow - exclude openmp-target tests from navi3 and vega20 --------- Co-authored-by: Larry Meadows <[email protected]> 
Co-authored-by: Jonathan R. Madsen <[email protected]>
* SWDEV-492625: Track free memory HSA functions to help determine total amount of memory allocated on the system at any one time * Minor fixes to address comments * Update allocation size description * Moved get function back to specialization, minor typo fixes * Removed memory_operation_type field, removed memory_pool allocation enum, converted starting address to hex string for json format. * Made conversion to hex_string a function, changed address to use union rocprofiler_address_t type, changed VMEM descriptors * Removed as_hex from the global namespace * Formatting * Removed TRACK_EVENT for memory allocation, now TRACK_COUNTER for memory allocation is being performed * Check if address was recorded before retrieving allocation size in generate Perfetto * Formatting * Update source/lib/output/generatePerfetto.cpp * Explicitly disable app-abort tests * Remove excluding app-abort test from workflow CI - redundant bc these tests are explicitly marked as disabled now --------- Co-authored-by: Madsen, Jonathan <[email protected]> Co-authored-by: Jonathan R. Madsen <[email protected]>
…ation.cpp (#47) use v_rcp_f32 instead of v_fmac_f32
* Adding changes to register and read symbols from the hip fat binary * adding json output for host_functions * added error handling * adding json tool support * Adding tests * formatting changes * Adding documentation * refactoring as per amd-staging * Adding initializers and changing macros * Fix page-migration background thread on fork (ROCm#31) * Fix page-migration background thread on fork After falling off main in the forked child, all the children try to join on the parent's monitoring thread. This results in a deadlock. Parent is waiting for the child to exit, but the child is trying to join the parent's thread which is signaled from the parent's static destructors. Even with just one parent and child, due to copy-on-write semantics, a child signalling the background thread to join will still block (thread's updated state is not visible in the child). This fix creates background threads on fork per-child with a pthread_atfork handler, ensuring that each child has its own monitoring thread. 
* Formatting fixes
* Detach page-migration background thread and update test timeout
* Attach files with ctest
* Update corr-id assert
* Tweak on-fork, simplify background thread
* Revert thread detach
* Adding --collection-period feature in rocprofv3 to match v1/v2 parity (ROCm#9)
* Adding Trace Period feature to rocprofv3
* Adding feature documentation
* Update source/bin/rocprofv3.py

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fixing format
* Moving to Collection Period and changing the input params
* Format Fixes
* Fixing rebasing issues
* Removing atomic include from the tool
* Adding more options for units, optimizing the code
* Fixing rocprofv3.py
* Fixing time conv & adding time controlled app
* Fixing format
* Changing to shared memory testing methodology
* use of shmem use
* Fix include headers for transpose-time-controlled.cpp
* Format upload-image-to-github.py
* Removing shmem and using only env var to dump timestamps from the tool
* Tool Fixes + Test Config
* Adding Tests
* Fixing Review comments
* Update trace period implementation
* Update trace period tests
* check between start and stop timestamps
* Merge Fix
* Update validate.py
* Improve safety of rocprofiler_stop_context after finalization
* Pass context id to collection_period_cntrl by value
* Adding 20 us error margin
* Ensure log level for collection-period test is not more than warning

---------

Co-authored-by: Ammar ELWazir <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <[email protected]>

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*
  - move error code check macros to implementation
  - fix macros which check error code
  - use constexpr values instead of #define
* Update lib/rocprofiler-sdk/code_object/hip/code_object.*
  - debugging for error that cannot be locally reproduced
* Update lib/rocprofiler-sdk/code_object/hip/code_object.*
  - improve error handling and logging
* Update lib/rocprofiler-sdk/code_object/hip/code_object.*
  - tweak to non-fatal logging messages
* Update lib/rocprofiler-sdk/code_object/hip/code_object.*
  - cleanup of logging messages
* Update host kernel symbol register data fields
* Update source/lib/rocprofiler-sdk/code_object/hip/code_object.hpp

---------

Co-authored-by: Madsen, Jonathan <[email protected]>
Co-authored-by: Kuricheti, Mythreya <[email protected]>
Co-authored-by: Elwazir, Ammar <[email protected]>
Co-authored-by: Ammar ELWazir <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <[email protected]>
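The fork fix described in the commit message above (a per-child monitoring thread created from a pthread_atfork handler) can be sketched roughly as follows. This is a minimal illustration, not the rocprofiler-sdk implementation: every name in the `sketch` namespace is hypothetical, and the monitoring loop is a placeholder. It assumes a POSIX system where starting a thread from the atfork child handler works in practice (as it does with glibc); strictly speaking, POSIX only guarantees async-signal-safe calls in the child of a multithreaded process.

```cpp
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

namespace sketch
{
// Hypothetical state for one background monitoring thread per process.
std::atomic<bool> g_stop{false};
std::thread*      g_monitor = nullptr;

// Placeholder for the real monitoring work: poll until asked to stop.
void monitor_loop()
{
    while(!g_stop.load())
        std::this_thread::sleep_for(std::chrono::milliseconds{1});
}

void start_monitor()
{
    assert(g_monitor == nullptr);
    g_stop.store(false);
    g_monitor = new std::thread{monitor_loop};
}

void stop_monitor()
{
    if(g_monitor == nullptr) return;
    g_stop.store(true);
    g_monitor->join();
    delete g_monitor;
    g_monitor = nullptr;
}

// Runs in the child immediately after fork(). The parent's thread does not
// exist in the child, so joining the inherited handle would block forever.
// The stale std::thread object is deliberately leaked (destroying a joinable
// std::thread calls std::terminate) and a fresh thread is started instead.
void child_after_fork()
{
    g_monitor = nullptr;
    start_monitor();
}

void install_fork_handler()
{
    pthread_atfork(nullptr, nullptr, child_after_fork);
}

// Forks once. The child joins its own monitor and exits; the parent reports
// whether the child terminated normally with exit code 0.
bool fork_and_check()
{
    pid_t pid = fork();
    if(pid < 0) return false;
    if(pid == 0)
    {
        stop_monitor();  // joins the child's thread, not the parent's
        _exit(0);
    }
    int status = 0;
    if(waitpid(pid, &status, 0) != pid) return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
}  // namespace sketch
```

A caller would invoke `sketch::install_fork_handler()` and `sketch::start_monitor()` once at startup. Without the handler, the child's `stop_monitor()` would try to join a thread that only exists in the parent, which is the deadlock the commit message describes.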
Temporarily disable sampled VM_IDs check
* Misc AFAR VII updates + clang-tidy-19 + bump version to 0.6.0
  - move tests/rocprofv3/trace-period to tests/rocprofv3/collection-period
  - bump clang-tidy to v19
  - fix misc clang-tidy errors
* Update the collection period test
  - don't attach files on fail bc when test is disabled, it causes problems

---------

Co-authored-by: Jonathan R. Madsen <[email protected]>
* PC Sampling API: emit info logs instead of error

Inside the PC sampling API, emit info logs instead of error logs. The tests verify the status code of each API call and decide when to skip, instead of relying on messages in logs. The samples_processing.cpp test has been removed as it's not used.

PC sampling must be explicitly enabled. Emit fatal error otherwise.

Co-authored-by: Madsen, Jonathan <[email protected]>

---------

Co-authored-by: Indic, Vladimir <[email protected]>
Co-authored-by: Madsen, Jonathan <[email protected]>
* fix avail test
* changing the regular expression
* Adding fatal error to avail script
* Revert "changing the regular expression"

This reverts commit e522143b5d9dccb870fd7f5667619ed32687d1e6.
* Updates to counter collection optimizations
* Fix logic error

---------

Co-authored-by: Jonathan R. Madsen <[email protected]>
* Updating rocprofv3 doc for pc sampling beta option
* Update source/docs/rocprofv3_input_schema.json
* Update using-rocprofv3.rst

---------

Co-authored-by: Elwazir, Ammar <[email protected]>
Co-authored-by: Jonathan R. Madsen <[email protected]>
…part of API call (#57) --------- Co-authored-by: Benjamin Welton <[email protected]> Co-authored-by: Benjamin Welton <[email protected]>
* reducing docs logging
* Addressing review comments
* exclude dirs
* maximize NUM_PROC_THREADS
* parallel build
* Deleting redundant action
* Single reusable workflow for PSDB and OSDB
* fixed calling psdb for mainline
* SWDEV-489158: Fix for exit thread safety
* Fixed exit thread logic
* Force CI to rerun
* Remove .vscode
* Fix thread safety bug
* Addressed some comments
* Formatting

---------

Co-authored-by: Giovanni Baraldi <[email protected]>
* Changing Mi300 Names

Making Mi300 names more specific: adding multiple types to differentiate between Mi300X, Mi300A, and Mi325X.

* Enable Mi300A PC Sampling testing
PR Details
What type of PR is this? (check all applicable)
Technical details
Please explain the changes along with the JIRA/GitHub link (if applicable).
libexec/ instead of libexec
libexec/ instead of sbin
Use predefined flags instead of hardcoding the package name, etc.
Added/updated tests?
We encourage you to keep the code coverage percentage at 80% and above.
BAAS Test results - https://compute-artifactory.amd.com/artifactory/list/rocm-osdb-22.04-deb/compute-rocm-dkms-component-baas-475/
Updated CHANGELOG?
Needed for Release updates for a ROCm release.
Added/Updated documentation?