rocm-systems

Author	SHA1	Message	Date
Mythreya Kuricheti	3e5749ea59	[rocprofiler-sdk][CI] Update codeql rocm version (#1909 ) * [rocprofiler-sdk][CI] Update codeql rocm version * Build HIP from source	2025-12-02 12:17:59 -06:00
Wenkai Du	3e650467fa	Use one side stream per process (#2063 ) * Use one side stream per process * Handle multiple GPUs per process * Reset stream when not found * Address review comments * Fix missing mutex initializer [ROCm/rccl commit: `185e78a8f0`]	2025-12-02 10:03:15 -08:00
Wenkai Du	185e78a8f0	Use one side stream per process (#2063 ) * Use one side stream per process * Handle multiple GPUs per process * Reset stream when not found * Address review comments * Fix missing mutex initializer	2025-12-02 10:03:15 -08:00
Yiltan	0f32739b52	Updated important missing environment variables (#344 ) [ROCm/rocshmem commit: `8b350a51fe`]	2025-12-02 11:40:30 -05:00
Yiltan	8b350a51fe	Updated important missing environment variables (#344 )	2025-12-02 11:40:30 -05:00
Aurelien Bouteiller	1e3a161c74	MLX5 cards have a vendor-id that does not match the pci-vendor-id for (#342 ) some reason. Signed-off-by: Aurelien Bouteiller <abouteil@amd.com> [ROCm/rocshmem commit: `0f7da76018`]	2025-12-02 11:32:37 -05:00
Aurelien Bouteiller	0f7da76018	MLX5 cards have a vendor-id that does not match the pci-vendor-id for (#342 ) some reason. Signed-off-by: Aurelien Bouteiller <abouteil@amd.com>	2025-12-02 11:32:37 -05:00
Jin Jung	3e25decb39	SWDEV-565719 - Enable GL Interop Tests on Windows (#1808 ) * SWDEV-565719 - Enable GL Interop Tests on Windows * SWDEV-565719 - Define USE_EGL feature * Update projects/hip-tests/catch/unit/gl_interop/gl_interop_common.hh Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * SWDEV-565719 - Add glewInit * SWDEV-565719 - Remove incorrect glutExit() * Refactor GL interop build configuration --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>	2025-12-02 08:19:07 -08:00
Alysa Liu	3a7b5571c0	kfdtest: Replace pthread with std::thread (#1448 ) * kfdtest: Replace pthread with std::thread Modify concurrent kfdtest to use std::thread instead of pthread, eventually modify KFDTestLaunch to take in a member function of test instance instead of static function. Convert KFDQMTest to pass in member function for multi-gpu kfdtest. * kfdtest: Convert KFDPerfCountersTest to use std::thread Convert KFDPerfCountersTest to use std::thread for multi-gpu kfdtest. * kfdtest: Convert KFDGraphicsInterop to use std::thread Convert KFDGraphicsInterop to use std::thread for multi-gpu kfdtest. * kfdtest: Convert KFDGWSTest to use std::thread Convert KFDGWSTest to use std::thread for multi-gpu kfdtest. * kfdtest: Convert KFDCWSRTest to use std::thread Convert KFDCWSRTest to use std::thread for multi-gpu kfdtest. * kfdtest: Convert KFDEventTest to use std::thread Convert KFDEventTest to use std::thread for multi-gpu kfdtest. * kfdtest: Convert KFDExceptionTest to use std::thread Convert KFDExceptionTest to use std::thread for multi-gpu kfdtest. * kfdtest: Convert KFDLocalMemoryTest to use std::thread Convert KFDLocalMemoryTest to use std::thread for multi-gpu kfdtest. * kfdtest: Convert KFDMemoryTest to use std::thread Convert KFDMemoryTest to use std::thread for multi-gpu kfdtest. * kfdtest: Convert KFDSVMRangeTest to use std::thread Convert KFDSVMRangeTest to use std::thread for multi-gpu kfdtest. * kfdtest: Convert KFDHWSTest to use std::thread Convert KFDHWSTest to use std::thread for multi-gpu kfdtest. * kfdtest: Remove pthread multigpu test structure Remove older multi-gpu test framework which uses pthread.	2025-12-02 10:25:21 -05:00
Alysa Liu	81df45d896	rocrtst: Add test for filter ROCR_VISIBLE_DEVICES (#2016 ) Improve test coverage for amd_filter_device.cpp.	2025-12-02 10:15:03 -05:00
vedithal-amd	5bbb72f516	Add JIRA ID to pull request template (#2099 )	2025-12-02 10:14:27 -05:00
lancesix	7ba0f1bdcf	ROCr-runtime: Allow core dumps for gfx115x (#1827 ) Core dumps are not supported for gfx110x, but should be possible for gfx115x. The current code disables core dumps completely for all gfx11xx agents; relax this to allow gfx115x.	2025-12-02 11:51:29 +00:00
Jaydeep	cf91f5f77a	SWDEV-568926 - Destroy queue_inactive_signal directly. (#2059 )	2025-12-02 09:55:58 +05:30
Ioannis Assiouras	65b769ee16	SWDEV-569101 - increase signal list size to at least DEBUG_HIP_GRAPH_BATCH_SIZE (#2084 )	2025-12-01 18:52:51 -08:00
SaleelK	c105dcd05b	clr: Use graph segment scheduling to process HIP Graphs (#1372 ) * clr: Use graph segment scheduling to process HIP Graphs * Add a broader path to use capture packet capture for all topologies * Refactor code * Use DEBUG_HIP_GRAPH_SEGMENT_SCHEDULING to toggle new vs classic path, Enabled by default * clr: Few fixes and improvements * clr: Detect complex graphs to take classic path * Use DEBUG_HIP_GRAPH_SEGMENT_SCHEDULING=2 to force segment scheduling path * clr: Fix a corner-case stack corruption * clr: Track commands of segments instead of snapshots * clr: Fix Batch dispatch logic * Track fence_dirty_ flag for command of other streams * Dependency resolution markers can now accommodate dirty fence on cross streams --------- Co-authored-by: Ioannis Assiouras <Ioannis.Assiouras@amd.com> Co-authored-by: Godavarthy Surya, Anusha <agodavar@amd.com>	2025-12-01 12:49:26 -08:00
Kutovoi, Vadim	dde9e2464e	gda: add check for active interfaces when selecting the GDA backend (#327 ) * gda: add check for active interfaces when selecting the GDA backend * fix __func__ macro in rocshmem_ctx_pe_quiet * gda: switch to more generic RDMA NIC term in has_active_ib_interface * gda: add active MLX5 and Pensando vendor ID checks for backend selection [ROCm/rocshmem commit: `29000a5644`]	2025-12-01 15:49:25 -05:00
Kutovoi, Vadim	29000a5644	gda: add check for active interfaces when selecting the GDA backend (#327 ) * gda: add check for active interfaces when selecting the GDA backend * fix __func__ macro in rocshmem_ctx_pe_quiet * gda: switch to more generic RDMA NIC term in has_active_ib_interface * gda: add active MLX5 and Pensando vendor ID checks for backend selection	2025-12-01 15:49:25 -05:00
Bindhiya Kanangot Balakrishnan	a627c12501	[SWDEV-566465] Fix json output for amdsmi reset (#2043 ) Fixed json output for reset command. Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>	2025-12-01 13:30:32 -06:00
Adel Johar	2c243feb1b	[Docs] Move environment variables to separate page (#341 ) [ROCm/rocshmem commit: `ba77bdd9a6`]	2025-12-01 14:25:27 -05:00
Adel Johar	ba77bdd9a6	[Docs] Move environment variables to separate page (#341 )	2025-12-01 14:25:27 -05:00
dependabot[bot]	2f96618210	Bump rocm-docs-core[api_reference] from 1.30.0 to 1.30.1 in /docs/sphinx (#209 ) Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.30.0 to 1.30.1. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.30.0...v1.30.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-version: 1.30.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> [ROCm/rocjpeg commit: `a1f8e1e3b5`]	2025-12-01 09:56:25 -08:00
dependabot[bot]	a1f8e1e3b5	Bump rocm-docs-core[api_reference] from 1.30.0 to 1.30.1 in /docs/sphinx (#209 ) Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.30.0 to 1.30.1. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.30.0...v1.30.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-version: 1.30.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-01 09:56:25 -08:00
cfallows-amd	29a7591791	Update RHEL8/9 workflow with latest rocm 7.1.1 links (#2060 ) Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>	2025-12-01 11:14:33 -05:00
Yiltan	77ba8cc76c	Cleanup readme.md [ROCm/rocshmem commit: `774159a08f`]	2025-12-01 10:43:18 -05:00
Yiltan	774159a08f	Cleanup readme.md	2025-12-01 10:43:18 -05:00
marantic-amd	3b11e01716	Perfetto traces from cached data (#1704 ) ## Motivation The idea is to unify the way and place where we store our traces. Current implementation uses `trace_cache` for rocpd traces, but perfetto is inlined inside of each module. This change allows us to have a single point in code where we will collect data, process it and store it in the desired format. This means that we can declutter the code further and have single point of responsibility and single point of failure. ## Technical Details New `processor` (perfetto_post_processing.cpp) is added to the `trace_cache` whose purpose is to use the cached data to populate perfetto tracks. Cache manager is responsible for keeping the instance of this processor and for its lifetime.	2025-12-01 09:59:16 -05:00
Kian Cossettini	b506c75f28	[rocprof-sys] Fix roctx wall clock tree, change timemory push/pop to use proper category, and add roctx as valid domain choice (#2062 ) When doing this ticket, I also noticed the program would SEGFAULT when ROCPROFSYS_ROCM_DOMAINS=roctx even though the docs tell us we can do this. Went ahead and fixed that. Also noticed that timemory push/pop in rocprofiler-sdk.cpp was always using category::rocm_marker_api instead of CategoryT. Fixed that as well.	2025-12-01 09:50:58 -05:00
Yiltan	2079193495	Update docs for GDA (#337 ) [ROCm/rocshmem commit: `5606fdafd6`]	2025-12-01 09:38:11 -05:00
Yiltan	5606fdafd6	Update docs for GDA (#337 )	2025-12-01 09:38:11 -05:00
Kian Cossettini	ae29018bb0	[rocprofiler-systems] Enable HOST OMPVV runtime-instrumentation CTests (#1970 ) * Enable HOST ompvv runtime-instrumentation ctests * Fix rocprofiler-systems-avail-regex-negation test failure * Exclude problematic function from instrumentation * Make push pop skip an env option for ctests * Remove SKIP_PUSH_POP_CHECK from argument parse Co-authored-by: David Galiffi <David.Galiffi@amd.com> --------- Co-authored-by: David Galiffi <David.Galiffi@amd.com>	2025-12-01 09:26:24 -05:00
vstojilj	77f58ceb9f	SWDEV-558557 - Remove duplicate nodes when capturing hipMemcpyAsync (#1226 )	2025-12-01 11:25:13 +01:00
Horatio Zhang	5983ccefa5	librocdxg: Return DXG fd in amdgpu_device_get_fd Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Flora Cui <flora.cui@amd.com>	2025-12-01 14:29:20 +08:00
Horatio Zhang	6bb933f820	librocdxg: Add drm metadata related interface Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Flora Cui <flora.cui@amd.com>	2025-12-01 14:29:13 +08:00
Milan Radosavljevic	ee7305e795	[rocprof-sys] Add test cleanup fixtures for binary-rewrite and runtime-instrument tests (#2012 ) - Added `binary-rewrite-cleanup` and `runtime-instrument-cleanup` tests that remove instrumented binaries and output directories using `cmake -E rm -rf` - Implemented CMake test fixtures (`FIXTURES_SETUP` and `FIXTURES_CLEANUP`) to establish proper test ordering: - `binary-rewrite` sets up the `binary-rewrite-fixture` - `binary-rewrite-run` and validation tests require this fixture - `binary-rewrite-cleanup` performs cleanup for this fixture - Same pattern applied for `runtime-instrument` - Extended `ROCPROFILER_SYSTEMS_ADD_PYTHON_TEST` to accept `FIXTURES_REQUIRED` parameter - Updated validation tests to require appropriate cleanup fixtures based on test name pattern matching - Added fixture requirements to Python code-coverage tests	2025-11-28 18:51:54 -05:00
abchoudh-amd	fd61b0f507	Add CU Utilization and deprecate Active CUs (#1822 ) * ChangeLog * Deprecation notice in old arch * Deprecation notice current arch * New config hash * Added Config deltas * Added metric description	2025-11-28 11:32:25 -05:00
vedithal-amd	3f2fbc18e9	[rocprofiler-compute] Only depend on amdsmi in profile phase (#2044 ) * Only depend on amdsmi in profile phase * amdsmi interface tests should have common prefix for easier testing	2025-11-28 11:32:00 -05:00
Flora Cui	0761dd0146	librocdxg: Increase AQL frame size calculation to prevent PM4 command buffer overflow Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Longlong Yao <Longlong.Yao@amd.com> Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/113>	2025-11-28 14:53:07 +08:00
Flora Cui	437e4b092e	librocdxg: Convert all CmdUtil methods to static Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Longlong Yao <Longlong.Yao@amd.com> Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/113>	2025-11-28 14:52:56 +08:00
Honglei Huang	68c8e111ae	rocr/format: Fix clang-format-diff.py path in format script (#1757 ) Update the format script to use absolute path for clang-format-diff.py instead of relative path. This ensures the script works correctly regardless of the current working directory when executed. - Change from './clang-format-diff.py' to '${root}/projects/rocr-runtime/clang-format-diff.py' - Improves script reliability and portability Signed-off-by: Honglei Huang <honghuan@amd.com>	2025-11-28 09:21:12 +08:00
Honglei Huang	aaa06e1609	libhsakmt/virtio: add non SVM mode in libhsakmt virtio driver and many fixes (#1756 ) * libhsakmt/virtio: change shmem size to 80 Some DGPU props have a lot of information, so it is necessary to increase the size of shmem. Signed-off-by: Honglei Huang <honghuan@amd.com> * libhsakmt/virtio: use BO handle instead of pointer in memory registration Change vhsakmt_map_to_gpu() return type from void* to vhsakmt_bo_handle to properly handle buffer object information. This allows access to both the host address and resource ID needed for memory registration. Signed-off-by: Honglei Huang <honghuan@amd.com> * libhsakmt/virtio: Improve memory mapping logic - Update vhsakmt_mappable() to check NoAddress flag and require HostAccess - Remove mappable checks in cpu_map/unmap to allow all BOs to be mapped - Set BO flags properly in vhsakmt_alloc_memory and scratch memory creation - Ensure scratch memory is correctly flagged for proper handling Signed-off-by: Honglei Huang <honghuan@amd.com> * libhsakmt/virtio: add no svm mode for libhsakmt virtio Add no svm mode for libhsakmt virtio driver, in no svm mode userptrs need UMD to manage, so add interval tree to manage them. New Features: - Add augmented red-black tree based interval tree implementation * Implement RB-tree insertion, deletion, and color balancing * Provide interval query for fast overlapping range lookup * Based on Linux kernel's augmented rbtree implementation - Improve userptr memory management * Use interval tree to efficiently track userptr memory regions * Support finding registered memory within given address ranges * Optimize memory mapping and unmapping performance Signed-off-by: Honglei Huang <honghuan@amd.com> --------- Signed-off-by: Honglei Huang <honghuan@amd.com>	2025-11-28 09:20:43 +08:00
Pratik Basyal	792ecc1a83	Formatting fixed (#1691 )	2025-11-27 18:55:45 -05:00
corey-derochie-amd	8e3f60e080	Add copyright to src/device/symmetric/all_reduce.cuh (#2080 ) [ROCm/rccl commit: `4acd0f64ea`]	2025-11-27 14:29:21 -07:00
corey-derochie-amd	4acd0f64ea	Add copyright to src/device/symmetric/all_reduce.cuh (#2080 )	2025-11-27 14:29:21 -07:00
Giovanni Lenzi Baraldi	0e04fdd571	Workaround for SWDEV-559598. Enabling more thread trace tests. (#1336 ) * Workaround for SWDEV-559598 * gfx11 fix	2025-11-27 20:03:38 +01:00
vstojilj	259010f2d5	SWDEV-491253 - Create stream capture test for kernel APIs (#1189 )	2025-11-27 17:40:11 +01:00
vstojilj	1c09c87cc7	SWDEV-564927 - Allow sizeBytes to be 0 when hipMemsetAsync is captured (#1849 )	2025-11-27 17:13:33 +01:00
Kian Cossettini	63713f01e0	[rocprofiler-systems] Add Fortran MPI CTests (#1172 ) * Add MPI CTests (use gfortran) * Add proper regex check * Skip Runtime-Instrument due to incompatibility with MPI Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Sajina Kandy <sputhala@amd.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-27 10:32:09 -05:00
vedithal-amd	2e10041210	Fix sL1D values in memory chart (#2037 )	2025-11-27 09:13:19 -05:00
Godavarthy Surya, Anusha	2e1c37a926	SWDEV-490861 - Remove recursion and extra loop in hipGraphLaunch (#1792 )	2025-11-27 10:25:08 +00:00
Ammar ELWazir	ed42157c31	Fixing Code Object Data Race and Thread Safety & Adding validation test (#2014 )	2025-11-26 20:28:52 -06:00


76333 commits