rocm-systems

Author	SHA1	Message	Date
Ioannis Assiouras	7313c3752a	SWDEV-567475 - Fix failures in graph tests due to GraphExec destroy h… (#1917 )	2025-11-22 23:01:47 +00:00
Julia Jiang	78a9d9ff70	[clr] SWDEV-566950 - Adding changelog for 7.2 (#1891 ) * [clr]SWDEV-566950 - Adding changelog for 7.2 * Update CHANGELOG.md * Update CHANGELOG.md	2025-11-19 09:10:14 -08:00
raramakr	eddd4c3601	SWDEV-505204 - Update libamdocl.so installation path to avoid exposing all ROCm libraries via ldconfig (#1914 ) ldconfig is run during rocm-opencl package installation. Installing libamdocl.so in /opt/rocm-xxx/lib exposes all ROCm libraries when /opt/rocm/lib is added to ldconfig. To prevent this, libamdocl.so is now installed in /opt/rocm-xxx/lib/opencl. ldconfig will use the updated path, limiting exposure to only libamdocl.so library. Co-authored-by: raramakr <raramakr@amd.com>	2025-11-19 21:14:28 +05:30
German Andryeyev	ff4782620e	SWDEV-547108 - Fix PAL build with HSA backend (#1850 ) When hip is built with HSA backend then the headers from ROCR will be used, but scratch_backing_memory_byte_size is a part of amd_queue_v2_t structure	2025-11-14 12:28:03 -05:00
Matt Arsenault	4830979f0e	SWDEV-548892 - Stop using ocml fma wrappers (#1702 ) Directly use elementwise builtin	2025-11-13 16:20:27 -08:00
Matt Arsenault	42e91b8934	SWDEV-548892 - Stop using ocml sqrt wrappers (#1716 )	2025-11-13 16:19:44 -08:00
Julia Jiang	5599e8b1de	SWDEV-561500 - Update change log and port 7.1.1 to develop branch (#1688 ) * SWDEV-561500 - Porting changelog(up to 7.1.1) to develop branch * Update CHANGELOG.md * Update CHANGELOG.md * Update CHANGELOG.md	2025-11-13 12:22:34 -08:00
pcritchl-amd	60cd210dac	Reapply "SWDEV-562996 - Build fix: Ubertrace callback calling convention mismatch on x86 (#1587 )" (#1717 ) (#1754 )	2025-11-12 13:47:24 -05:00
Ioannis Assiouras	4f91b68988	SWDEV-559166 - Remove obsolete member execInfoOffset from KernelParameters (#1790 )	2025-11-12 17:20:36 +00:00
Satyanvesh Dittakavi	07dd4c85e7	SWDEV-546308 - Implement hipKernelGetParamInfo API (#1783 )	2025-11-12 14:09:26 +05:30
jofrn	8f9da259ac	Fix memory leak in hip_fatbin.cpp UncompressAndPopulateCodeObject (#1692 ) Wrap amd_comgr_data_t item returned from action_data_get_data() in ComgrDataUniqueHandle to ensure it gets released.	2025-11-11 16:48:06 -05:00
systems-assistant[bot]	a66ca8809b	SWDEV-511239 - Remove `and` and use `&&` for preprocessors (#506 ) This shows up as warning in msvc. Co-authored-by: Jatin Chaudhary <JatinJaikishan.Chaudhary@amd.com>	2025-11-11 09:43:57 -08:00
Todd tiantuo Li	cf536a8c1a	SWDEV-554372 - Add 3 HIP_GET_PROC_ADDRESS_xxx flags (#1771 )	2025-11-10 23:29:40 -08:00
SaleelK	5e418ca256	clr: Allow all engines but prefer recommended engines (#1750 ) * Also honor ROC_P2P_SDMA_SIZE for IPC, since IPC can also mean P2P	2025-11-10 13:10:46 -08:00
Rakesh Roy	9cac2e46e4	SWDEV-565668 - Bump minor version for ROCm 7.2 (#1762 ) Additionally remove cmake option HIP_OFFICIAL_BUILD	2025-11-10 18:55:52 +05:30
Jin Jung	324a5519b9	SWDEV-563842 - Fix Memory Address Offset Bug (#1749 ) * SWDEV-563842 - Fix Memory Address Offset Bug * Revert "SWDEV-563842 - Fix Memory Address Offset Bug" This reverts commit 477958dc48300ee1fe0166aa6f0d3d8125b91f5e. * SWDEV-563842 - Fix Memcpy Address Offset Bug * SWDEV-563842 - Find Memcpy Device Address Offset * Revert "SWDEV-563842 - Find Memcpy Device Address Offset" This reverts commit 6c75a9e5b58b7dfabb9e3f91fa3dd892d42639cc. * Revert "SWDEV-563842 - Fix Memcpy Address Offset Bug" This reverts commit 0b89072a988074aa4da4e8fc7ba04c554f31ed44. * SWDEV-563842 - MemObjMap_ Offset Support This patch fixes the buffer offset handling bug. * Revert "SWDEV-563842 - MemObjMap_ Offset Support" This reverts commit 37fce3382465e3420721e5277377f943ec2b30a1. * SWDEV-563842 - External Memory Buffer View	2025-11-09 12:52:35 -08:00
Victor Zhang	7580052878	SWDEV-564318 - Add support for allocating uncached device memory (#1670 )	2025-11-09 12:51:41 -05:00
SaleelK	738bb19835	clr: Increase kernelArg/managedBuffer size (#1586 ) * Increase the buffer to 4MB. That can help kernel launches limited by a deep kernel pipeline Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>	2025-11-08 18:32:43 -08:00
Pengda Xie	93947241d0	SWDEV-556684 - HSAIL cleanup (#1657 )	2025-11-08 02:22:03 -08:00
Pengda Xie	5dd15e22ca	SWDEV-559514 - Add queue validation to submitMarker sync path (#1308 )	2025-11-08 02:21:36 -08:00
lancesix	f7ffcd1402	clr: SWDEV-547890 - Bump PAL API version to 954 (#1680 ) * clr: Adjust call to ICmdBuffer::CmdCopyMemoryToImage for PAL >= 955 Starting with version 955, PAL adds a new argument to ICmdBuffer::CmdCopyMemoryToImage. Adjust the call site to account for this. * clr: Handle new GpuUtil::TraceSessionState cases for PAL >= 939 Starting with PAL API version 939, GpuUtil::TraceSessionState changes its possible values. Adjust for it. * clr: require PAL version 954 Bump the required PAL version to 954, as this is required for proper debugger support.	2025-11-08 00:52:04 +00:00
Jin Jung	291ff6c468	SWDEV-558855 - Enable Interop Map Buffer on Windows (#1748 ) * Support Windows HANDLE in interop_map_buffer * Refactored Windows HANDLE in interop_map_buffer * ROCr System Dependent Handle Type * Fix for ROCr Handle Conversion Bug * Remove Windows Header	2025-11-07 12:47:01 -08:00
Jimbo	2006a411e5	SWDEV-561611 - fix codeql errors by increasing printf buffer sizes (#1507 ) * SWDEV-561611 - fix codeql errors by increasing printf buffer sizes * Replace sprintf with snprintf to prevent potential buffer overflow --------- Co-authored-by: cadolphe-amd <chris.adolphe@amd.com>	2025-11-07 15:42:56 -05:00
marandje	0ad05ed515	SWDEV-556947 - Parse the HIP version from the Git tag (#1135 )	2025-11-06 10:18:26 +01:00
Satyanvesh Dittakavi	478cee0f68	SWDEV-559525 - Add the HIP_POINTER_ATTRIBUTE_IS_LEGACY_HIP_IPC_CAPABLE attribute support (#1647 ) * SWDEV-559525 - Add the HIP_POINTER_ATTRIBUTE_IS_LEGACY_HIP_IPC_CAPABLE attribute implementation * Update indentation in hip_memory.cpp	2025-11-06 12:07:32 +05:30
lancesix	280cda3196	clr: SWDEV-547890 - Maintain an MQD for the emulated AQL queue (#1669 ) * clr: SWDEV-547890 - Maintain an MQD for the emulated AQL queue To simplify the shader debugger implementation, maintain the relevant parts of the emulated AQL queue's MQD (amd_queue_t): read_dispatch_id, write_dispatch_id, compute_tmpring_size. With this MQD, the shader debugger can handle the emulated AQL queue the same way it does the real AQL queue, no specialization is required. * clr: SWDEV-547890 - Conservatively update the MQD's read_dispatch_id The read_dispatch_id cannot be smaller than the current aql_packet_id - hsa_queue.size for the debugger to work correctly. The read_dispatch_id really should be updated when the CmdBuf is marked as complete. Left a FIXME to address it in a future commit. --------- Co-authored-by: Laurent Morichetti <laurent.morichetti@amd.com>	2025-11-05 17:39:33 +00:00
Rakesh Roy	8797bb0150	Revert "SWDEV-562996 - Build fix: Ubertrace callback calling convention mismatch on x86 (#1587 )" (#1717 ) This reverts commit `8d31383dfe`. Reason for revert: It is breaking TheRock build on Windows	2025-11-05 11:48:02 -05:00
pcritchl-amd	8d31383dfe	SWDEV-562996 - Build fix: Ubertrace callback calling convention mismatch on x86 (#1587 ) Co-authored-by: Rakesh Roy <137397847+rakesroy@users.noreply.github.com>	2025-11-05 10:37:45 +05:30
Scott Todd	fdbafd7757	Revert "SWDEV-554372 - Add 3 HIP_GET_PROC_ADDRESS_xxx flags (#1057 )" (#1690 ) Reverts ROCm/rocm-systems#1057 Suspected of breaking the build, see https://github.com/ROCm/rocm-systems/pull/1057#issuecomment-3487715129 Logs: https://github.com/ROCm/rocm-systems/actions/runs/19062134668/job/54444052479#step:12:315 ``` [rocprofiler-sdk] FAILED: source/lib/rocprofiler-sdk/CMakeFiles/rocprofiler-sdk-object-library.dir/hip/abi.cpp.o [rocprofiler-sdk] ccache /opt/rh/gcc-toolset-12/root/usr/bin/c++ -DAMD_INTERNAL_BUILD=1 -DGLOG_USE_GLOG_EXPORT -DROCPROFILER_DL=1 -DROCPROFILER_HAS_GHC_LIB_FILESYSTEM=1 -DROCPROFILER_SDK_USE_SYSTEM_RCCL=0 -DROCPROFILER_SDK_USE_SYSTEM_ROCDECODE=0 -DROCPROFILER_SDK_USE_SYSTEM_ROCJPEG=0 -DUSE_PROF_API=1 -DYAML_CPP_STATIC_DEFINE -D__HIP_PLATFORM_AMD__=1 -Drocprofiler_EXPORTS=1 -I/__w/rocm-systems/rocm-systems/TheRock/build/profiler/rocprofiler-sdk/build/source/include -I/__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/source/include -I/__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/source -I/__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/external/yaml-cpp/include -I/__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/external/ptl/source -I/__w/rocm-systems/rocm-systems/TheRock/build/profiler/rocprofiler-sdk/build/external/ptl/source -isystem /__w/rocm-systems/rocm-systems/TheRock/build/core/clr/dist/include -isystem /__w/rocm-systems/rocm-systems/TheRock/build/core/ROCR-Runtime/dist/include -isystem /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/external/filesystem/include -isystem /__w/rocm-systems/rocm-systems/TheRock/build/profiler/rocprofiler-sdk/build/external/glog -isystem /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/external/glog/src -isystem /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/external/fmt/include -isystem /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/external/elfio -isystem 
/__w/rocm-systems/rocm-systems/TheRock/build/compiler/amd-comgr-stub/dist/include -isystem /__w/rocm-systems/rocm-systems/TheRock/build/third-party/sysdeps/linux/libdrm/build/stage/lib/rocm_sysdeps/lib/pkgconfig/../../include -isystem /__w/rocm-systems/rocm-systems/TheRock/build/third-party/sysdeps/linux/libdrm/build/stage/lib/rocm_sysdeps/lib/pkgconfig/../../include/libdrm -isystem /__w/rocm-systems/rocm-systems/TheRock/build/third-party/sysdeps/linux/elfutils/build/dist/lib/rocm_sysdeps/include -O3 -DNDEBUG -std=c++17 -fPIC -fvisibility=hidden -fvisibility-inlines-hidden -W -Wall -Wno-unknown-pragmas -faligned-new -rdynamic -fstack-protector-strong -Wstack-protector -MD -MT source/lib/rocprofiler-sdk/CMakeFiles/rocprofiler-sdk-object-library.dir/hip/abi.cpp.o -MF source/lib/rocprofiler-sdk/CMakeFiles/rocprofiler-sdk-object-library.dir/hip/abi.cpp.o.d -o source/lib/rocprofiler-sdk/CMakeFiles/rocprofiler-sdk-object-library.dir/hip/abi.cpp.o -c /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/source/lib/rocprofiler-sdk/hip/abi.cpp [rocprofiler-sdk] In file included from /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/source/lib/rocprofiler-sdk/hip/abi.cpp:26: [rocprofiler-sdk] /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/source/lib/common/abi.hpp:62:27: error: static assertion failed: size of the API table struct has changed. Update the STEP_VERSION number (or in rare cases, the MAJOR_VERSION number) [rocprofiler-sdk] 62 \| sizeof(TABLE) == ::rocprofiler::common::abi::compute_table_offset(NUM), \ ```	2025-11-04 14:29:58 -08:00
Sam Ruscica	757de39caa	Updated amdFileRead/Write in rocdevice to support windows build (#1435 ) * Updated amdFileRead in rocdevice to support windows build * Updated amdFileRead in rocdevice to support windows build	2025-11-04 10:03:03 -05:00
Todd tiantuo Li	7573fa168d	SWDEV-554372 - Add 3 HIP_GET_PROC_ADDRESS_xxx flags (#1057 )	2025-11-04 00:16:12 -08:00
MachineTom	fb006546d0	SWDEV-1 - Fix a typo (#1615 ) * SWDEV-1 - Fix a typo Fix a typo. Remove unnecessary log. * Removing patch --------- Co-authored-by: geomin12 <geomin12@amd.com> Co-authored-by: Scott Todd <scott.todd0@gmail.com>	2025-11-03 12:59:00 -08:00
Ajay GunaShekar	d998a5280a	Revert "clr: SWDEV-547890 - Maintain an MQD for the emulated AQL queue (#1316 )" (#1654 ) This reverts commit `f5bbb09c0d`. windows build failure and requires PAL update	2025-11-03 08:17:26 -08:00
lmoriche	f5bbb09c0d	clr: SWDEV-547890 - Maintain an MQD for the emulated AQL queue (#1316 ) * clr: SWDEV-547890 - Maintain an MQD for the emulated AQL queue To simplify the shader debugger implementation, maintain the relevant parts of the emulated AQL queue's MQD (amd_queue_t): read_dispatch_id, write_dispatch_id, compute_tmpring_size. With this MQD, the shader debugger can handle the emulated AQL queue the same way it does the real AQL queue, no specialization is required. * clr: SWDEV-547890 - Conservatively update the MQD's read_dispatch_id The read_dispatch_id cannot be smaller than the current aql_packet_id - hsa_queue.size for the debugger to work correctly. The read_dispatch_id really should be updated when the CmdBuf is marked as complete. Left a FIXME to address it in a future commit.	2025-10-31 16:07:02 -04:00
Satyanvesh Dittakavi	f332888366	SWDEV-560304 - Fix segfault with invalid stream (#1360 )	2025-11-01 00:04:44 +05:30
Jaydeep	10763f0e7a	SWDEV-559505 - Enable back memset optimization and handle the cases when setParam can change the number of AQL packets for memset graph node. (#1320 ) Co-authored-by: jaydeeppatel1111 <jaypatel@amd.com>	2025-10-31 22:49:14 +05:30
Ioannis Assiouras	1dd0237cb2	SWDEV-563752 - Allow hipMemLocationTypeHost in hipMemSetAccess even if memory was created on the device (#1620 ) Co-authored-by: Rahul Manocha <rmanocha@amd.com>	2025-10-31 13:57:36 +00:00
dsicarov-amd	4915496bf9	SWDEV-533237 Add hipOccupancyAvailableDynamicSMemPerBlock API (#899 ) * SWDEV-533237 Add initial support for hipOccupancyAvailableDynamicSMemPerBlock API * SWDEV-533237 Add hipOccupancyAvailableDynamicSMemPerBlock wrapper for nvidia * SWDEV-533237 Add implementation of hipOccupancyAvailableDynamicSMemPerBlock API * SWDEV-533237 Add LDSAlignment field in Isa table --------- Co-authored-by: Rahul Manocha <rmanocha@amd.com>	2025-10-29 10:58:42 +01:00
Ajay GunaShekar	f8e3858659	remove usage of HIP_RETURN in internal function (#1359 )	2025-10-27 15:37:46 -07:00
Rahul Manocha	f5d901f016	SWDEV-546311 - implement hipKernelGetLibrary & hipLibraryEnumerateKer… (#1143 ) * SWDEV-546311 - implement hipKernelGetLibrary & hipLibraryEnumerateKernels API * Fix for LibraryEnumerateKernel and KernelGetName * Update Enumerate Kernels to handle 0 numKernels * Minor fixes to function names * fix error checking in internal function * Update changelog for new apis --------- Co-authored-by: Rahul Manocha <rmanocha@amd.com>	2025-10-27 14:13:17 -07:00
Shadi Dashmiz	3e59eebf17	SWDEV-558510:Correct max mem per multiprocessor value (#1207 ) Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>	2025-10-27 15:45:06 -04:00
MachineTom	eb69a455ed	SWDEV-558844 - Cleanup Os header (#1530 ) Remove code that isn't used in the Os header.	2025-10-27 11:52:31 -04:00
SaleelK	f301053740	clr: Improve logging (#1457 )	2025-10-25 15:55:27 -07:00
Rakesh Roy	e9dac39102	SWDEV-560065 - Revert changes to align error code with Cuda when stream capture is tried on Legacy stream (#1337 ) * SWDEV-560065 - Revert "SWDEV-555484 - Invalidate capturing stream only for null/legacy stream. (#1032)" This reverts commit `99613f1009`. * SWDEV-560065 - Revert "SWDEV-542700 - Return an error if stream capture is attempted on the null stream while a stream capture is active. (#450)" This reverts commit `0647cf1d28`.	2025-10-24 21:33:25 +05:30
Rahul Manocha	4f075902fc	SWDEV-555347 - Remove lock contention in async events loop (#878 ) * SWDEV-555347 - Remove lock contention in async events loop * SWDEV-555347 - Introduce Pool of AsyncEventItems * create generic mempool for AsyncEventItem * Use BaseShared allocate and free for async event pool --------- Co-authored-by: Rahul Manocha <rmanocha@amd.com>	2025-10-24 08:43:00 -07:00
Jatin Chaudhary	48313b8655	SWDEV-1 add missing hiperror entries (#1450 )	2025-10-24 09:29:27 +01:00
SaleelK	839fb95717	clr: Do not increase signal pool (#1354 ) * Do not increase signal pool when profiling, instead allow saving off timestamps. This is slow but a tradeoff to memory footprint of the signals	2025-10-23 22:05:00 -07:00
MachineTom	5f76cb916d	SWDEV-555888 - Refactor Numa code (#1191 ) 1. Create a set of mini NUMA interfaces. On Linux, the interface is based on system calls rather than libnuma. On Windows, the interface also works, but the policy class is a dummy. Unlike Linux, Windows doesn't provide the numactl tool or a NUMA library to set up a NUMA policy, so the default policy is followed on Windows, that is, using the closest host NUMA node to allocate pinned host memory in hipHostMalloc(). To get the closest host NUMA node of a GPU device, you need to query the new attribute hipDeviceAttributeHostNumaId. Then you can create a thread with CPU affinity on that NUMA node. For an example, see the test in hip-tests/catch/perftests/memory/hipPerfHostNumaAllocWin.cc. 2. Remove pfnSetThreadGroupAffinity and pfnGetNumaNodeProcessorMaskEx, as these functions have been exposed since Win7 and Win server 2008. 3. Other minor fixes.	2025-10-23 21:56:15 -04:00
Ioannis Assiouras	602ea0be1e	SWDEV-558078 - Fix use-after-free in graph tests due to AsyncEventHandler (#1502 )	2025-10-23 22:49:24 +01:00
Pengda Xie	a4bbd73dc6	SWDEV-556684 - Remove HSAIL support (#1183 )	2025-10-23 11:21:49 -07:00

13250 Commits