Matt Arsenault
0c0d8dc974
SWDEV-548892 - Stop using __ockl_lane_id ( #2186 )
...
__lane_id already exists and is identical.
2025-12-19 20:34:55 +01:00
Sourabh U Betigeri
883fdfb820
Revert "clr: Minor fixes for error return" ( #2399 )
...
- This reverts commit 8dd8436e43c7f0d062fd73252bf61c35615d181d.
- Resolve MIOpen test failures observed in TheRock
- TheRock Issue: ROCm/TheRock#2642
- rocm-systems issue: #2400
2025-12-18 18:40:13 -05:00
Jatin Chaudhary
fdf73116d5
Do not allocate code objects when we map a static code object ( #2332 )
2025-12-18 09:22:02 +00:00
Maneesh Gupta
4a9833e70e
Revert "Add HasExpertSchedMode device prop ( #2241 )" ( #2371 )
...
This reverts commit c0b4aef5ad.
2025-12-17 21:26:44 -08:00
Shadi Dashmiz
96f6b6e251
SWDEV-571304 : Fix the constructor for __half ( #2240 )
...
- Comply with CUDA
- Fix use case for constexpr
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com >
2025-12-17 11:15:20 -05:00
Filip Jankovic
c0b4aef5ad
Add HasExpertSchedMode device prop ( #2241 )
...
* Add HasExpertSchedMode device prop
* Add unit tests for HasExpertSchedMode
* Add gfx12 check for HasExpertSchedMode prop
* Update gfx major version check and test for ExpertSchedMode
* Minor fix and ROCr version bump
* Update projects/rocr-runtime/runtime/hsa-runtime/inc/hsa_ext_amd.h
* Update projects/rocr-runtime/runtime/hsa-runtime/inc/hsa_ext_amd.h
* Apply suggestion from @dayatsin-amd
* Apply suggestion from @dayatsin-amd
---------
Co-authored-by: Stefan Sokolovic <stefan.sokolovic2@amd.com >
Co-authored-by: David Yat Sin <77975354+dayatsin-amd@users.noreply.github.com >
2025-12-17 17:06:08 +01:00
randyh62
1240b592a5
Git url fix ( #2285 )
...
* Update README-doc.md
Correct GitHub URL for components moved into rocm-systems
* Update amd_clr.rst
Update github.com URLs
* Update Dockerfile
Update rocm-systems paths
* Update CONTRIBUTING.md
update for rocm-systems
* Update CONTRIBUTING.md
minor change
* Update CONTRIBUTING.md
* Update CONTRIBUTING.md
* Update hip_runtime_api.rst
Update for rocm-systems
* Update installation.rst
update URL to libhsakmt
* Update what_is_hip.rst
* Update projects/clr/CONTRIBUTING.md
Co-authored-by: Dominic Widdows <dwiddows@gmail.com >
* Update projects/clr/README-doc.md
Co-authored-by: Dominic Widdows <dwiddows@gmail.com >
* Update Dockerfile
Update git clone for sparse checkout
* Update projects/hip/CONTRIBUTING.md
* Update projects/clr/CONTRIBUTING.md
* Update projects/hipother/CONTRIBUTING.md
---------
Co-authored-by: Dominic Widdows <dwiddows@gmail.com >
2025-12-15 11:57:18 -08:00
systems-assistant[bot]
b002c6a739
SWDEV-538607 - Add SIMDe as a build dependency, remove naked intrinsic use. ( #500 )
...
Co-authored-by: Alex Voicu <alexandru.voicu@amd.com >
Co-authored-by: Ioannis Assiouras <Ioannis.Assiouras@amd.com >
2025-12-15 17:40:51 +00:00
Matt Arsenault
49565f9d9f
SWDEV-548892 - Always declare used ocml and ockl device libs functions ( #2230 )
...
Ignore __CLANG_HIP_RUNTIME_WRAPPER_INCLUDED__. This should not be relying
on declarations from the clang builtin headers. There is no issue declaring
the same intrinsics multiple times. This will enable removal of declarations
from the clang builtin headers.
2025-12-15 17:23:33 +01:00
Fábio Mestre
447beeb00b
Replace usages of __ockl_gws_init with __builtin_amdgcn_ds_gws_init ( #2235 )
2025-12-15 16:56:14 +01:00
Dominic Widdows
9a8ed9f45d
Doc updates updating internal links from deprecated repos to rocm-systems project locations ( #2294 )
...
* Update README documentation links for clarity and consistency across projects
- Changed links in the README files for `clr`, `hipother`, and `hip-tests` to use relative paths instead of absolute URLs, improving navigation within the repository.
* Update CONTRIBUTING documentation to use relative links for improved navigation
- Changed absolute URLs to relative paths in the CONTRIBUTING.md files for the hip and hipother projects, enhancing consistency and ease of access within the repository.
2025-12-12 13:21:42 -08:00
SaleelK
840301e12d
clr: Minor fixes for error return ( #2153 )
2025-12-11 16:59:56 -08:00
SaleelK
10635483ad
clr: Fix packet batch write logic ( #2236 )
...
* When writing bulk packets, always invalidate packet headers. It's
possible that the CP fetcher has multiple packets in flight. In
such cases we may end up with a malformed packet because the writes are
not complete, yet the CP finds a valid header.
2025-12-11 04:26:41 -08:00
Matt Arsenault
a495d1137e
SWDEV-548892 - Make declaration of __ockl_fdot2 always available ( #2229 )
2025-12-11 11:53:11 +01:00
German Andryeyev
3895aadba6
SWDEV-558849 - Make ROCR path in Windows more stable ( #2181 )
2025-12-10 12:37:10 -05:00
Pengda Xie
1d6b26f829
SWDEV-556684 - HSAIL Cleanup re-apply commit 4abdfe5: ( #2024 )
...
Removed the following options:
-xnack, -force-wgp-mode, -force-wave-size-32, -round-trip-spirv,
-fe-gen-spirv, -lower-pipe-builtins=0|1, -lower-atomics=0|1,
-set-lds=<value>, -set-scalar-registers=<value>,
-set-vector-registers=<value>, -limit-scalar-registers=<value>,
-limit-vector-registers=<value>, -sc-xnack-iommu,
-faa-for-barrier/-fno-a-for-barrier, -sc-dev-format, -verify-lwspir,
-verify-hwspir, -ffma-enable/-fno-fma-enable,
-fmad-enable/-fno-mad-enable, -fdisable-avx/-fno-disable-avx,
-fforce-llvm/-fno-force-llvm, -print-compile-phases,
-kernel-cache-enforce-miss, -kernel-cache-wipe, -kernel-cache,
-sc[=<filename>]/--load-sc-dll[=<filename>],
-be[=<filename>]/--load-be-dll[=<filename>],
-cg[=<filename>]/--load-cg-dll[=<filename>],
-link[=<filename>]/--load-link-dll[=<filename>],
-opt[=<filename>]/--load-opt-dll[=<filename>],
-fe[=<filename>]/--load-fe-dll[=<filename>],
-cl[=<filename>]/--load-cl-dll[=<filename>], -just-kernel=<kernel-name>,
-use-debugil, -fmulti-level-call/-fno-multi-level-call,
-fdebug-call/-fno-debug-call, -fmacro-call/-fno-macro-call,
-fstack-uav/-fno-stack-uav, -fdef-res-id/-fno-def-res-id,
-wokth=int/--waves-opt-kernel-threshold,
-ilkth=int/--inline-kernel-size-threshold,
-ilsth=int/--inline-size-threshold, -ilcth=int/--inline-cost-threshold,
-scopt=int/--sc-opt-level, -flib-no-inline/-fno-lib-no-inline,
-fuser-no-inline/-fno-user-no-inline,
-scras=int/--sc-si-opt-reg-alloc-strategy, -fsc-post-ra-sched,
-fsc-live-sched/-fno-sc-live-sched, -fsc-use-buffer-for-hsa-global,
-fsc-schedule-no-reorder, -fsc-min-reg-schedule,
-fsc-bias-schedule-to-minimize-insts,
-fsc-bias-schedule-to-minimize-regs, -fsc-disable-merge-memory,
-fsc-disable-loop-unroll, -fsc-use-mubuf/-fno-sc-use-mubuf,
-fsc-selective-inline/-fno-sc-selective-inline,
-fsc-keep-calls/-fno-sc-keep-calls, -slc=0|1/--simplifylibcall,
-stack-alignment=<n>, -fdiv2fmul=0|1, -prt-opt-liveness=0|1,
-liveness=0|1, -SRAE-threshold=<value>, -memcombine-max-vec-gen=<value>,
-small-global-objects, -fast-fmaf, -fast-fma, -bfo=0|1, -ebb=0|1, -aa,
-mem2reg=0|1, -licm=0|1, -unroll-allow-partial,
-unroll-threshold=<positive integer>, -unroll-count=<positive integer>,
-apt/--ap-threshold=<positive integer>, -srt/--sr-threshold=<positive
integer>, -fdebug-linker/-fno-debug-linker, -fbin-gpu64/-fno-bin-gpu64,
-fbin-disasm/-fno-bin-disasm, -fbin-bif30, -fbin-hsail/-fno-bin-hsail,
-fbin-amdil/-fno-bin-amdil, -fbin-spir/-fno-bin-spir, -fonly-bin-source,
-fper-pointer-uav/-fno-per-pointer-uav
Co-authored-by: Konstantin Zhuravlyov <kzhuravl_dev@outlook.com >
2025-12-10 09:09:12 -08:00
Dominic Widdows
75bea883e1
Remove redirect notice from redirect target ( #2104 )
...
README is copied from https://github.com/ROCm/clr which redirects to target
https://github.com/ROCm/rocm-systems/tree/develop/projects/clr
This is correct for https://github.com/ROCm/clr I think, but unnecessary for https://github.com/ROCm/rocm-systems/edit/develop/projects/clr/README.md which is already the correct redirect target.
2025-12-09 10:47:51 -08:00
Victor Zhang
aaecffa50b
SWDEV-568847 - prevent UAF when registering callbacks on completed events ( #2066 )
...
* SWDEV-568847 - prevent UAF when registering callbacks on completed events
* cache the status() of event earlier
* Update command.cpp
* revert cl_event.cpp
* Update cl_event.cpp
---------
Co-authored-by: cadolphe-amd <chris.adolphe@amd.com >
2025-12-09 11:38:45 -05:00
Jatin Chaudhary
eea93d58a2
SWDEV-554626 - return correct error code ( #1107 )
...
* SWDEV-554626 - return hipErrorInvalidDeviceFunction when we cannot load module
Return correct error code when modules are empty
* Match the error codes
* Revert the error code
2025-12-09 16:10:25 +01:00
SaleelK
acc236fd89
clr: Avoid saving all ProfilingSignals at once ( #2108 )
...
* While reusing signals, it's possible to come across a timestamp
that contains several signals, such as when profiling a graph. Reading
timestamps from all signals can make the call severely CPU bound.
Instead, cache only that signal to avoid the overhead on the critical
path.
2025-12-08 11:32:16 -08:00
Jin Jung
deaf8ab38a
SWDEV-567119 - Windows GL Interop Support ( #1892 )
2025-12-08 11:03:59 -05:00
Shadi Dashmiz
4812d8e78b
SWDEV-566783 - clean up cmgr helper ( #1864 )
...
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com >
2025-12-08 10:37:03 -05:00
Ioannis Assiouras
3faf36fb25
Fix Unit_hipStreamBeginCaptureToGraph_CapturePartialInThreads ( #2072 )
...
https://mlsejenkinsvm.amd.com/job/rocm-systems/job/hip/view/change-requests/job/PR-2072/6/
The last Windows CI run passed successfully.
2025-12-08 13:30:23 +01:00
Lancelot Six
659737c824
clr: Bump _amdgpu_r_debug.r_version to 11 ( #2063 )
2025-12-05 16:01:08 -05:00
Rahul Manocha
9dd3c2fa70
SWDEV-563271 - return error when pal cmd submission fails ( #1585 )
2025-12-05 14:25:01 -05:00
Ajay GunaShekar
d6f6435b88
SWDEV-526504 - Remove perl dependency in hip/clr build ( #964 )
...
* SWDEV-1 - Remove perl dependency in hip/clr build
* SWDEV-1 - use python3 in place of perl for formatting date and time
2025-12-05 08:42:15 -08:00
Julia Jiang
272f06506f
SWDEV-549696 - Fix HIP catch sub-test failure for MipmappedArray ( #1198 )
...
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com >
2025-12-05 11:00:06 -05:00
systems-assistant[bot]
06a3a5ca10
SWDEV-546110 - Fix encoding for certain types ( #446 )
2025-12-05 13:16:14 +00:00
harkgill-amd
8f622de972
Add gfx1152 support to PAL ( #2077 )
2025-12-03 10:39:22 -08:00
Matt Arsenault
d75d0bc1c9
SWDEV-548892 - Stop using ocml exp and exp2 functions ( #2032 )
2025-12-02 13:39:09 -05:00
Ioannis Assiouras
65b769ee16
SWDEV-569101 - increase signal list size to at least DEBUG_HIP_GRAPH_BATCH_SIZE ( #2084 )
2025-12-01 18:52:51 -08:00
SaleelK
c105dcd05b
clr: Use graph segment scheduling to process HIP Graphs ( #1372 )
...
* clr: Use graph segment scheduling to process HIP Graphs
* Add a broader path to use packet capture for all topologies
* Refactor code
* Use DEBUG_HIP_GRAPH_SEGMENT_SCHEDULING to toggle new vs classic path;
enabled by default
* clr: Few fixes and improvements
* clr: Detect complex graphs to take classic path
* Use DEBUG_HIP_GRAPH_SEGMENT_SCHEDULING=2 to force segment scheduling
path
* clr: Fix a corner-case stack corruption
* clr: Track commands of segments instead of snapshots
* clr: Fix Batch dispatch logic
* Track fence_dirty_ flag for command of other streams
* Dependency resolution markers can now accommodate a dirty fence across
streams
---------
Co-authored-by: Ioannis Assiouras <Ioannis.Assiouras@amd.com >
Co-authored-by: Godavarthy Surya, Anusha <agodavar@amd.com >
2025-12-01 12:49:26 -08:00
vstojilj
77f58ceb9f
SWDEV-558557 - Remove duplicate nodes when capturing hipMemcpyAsync ( #1226 )
2025-12-01 11:25:13 +01:00
vstojilj
1c09c87cc7
SWDEV-564927 - Allow sizeBytes to be 0 when hipMemsetAsync is captured ( #1849 )
2025-11-27 17:13:33 +01:00
Godavarthy Surya, Anusha
2e1c37a926
SWDEV-490861 - Remove recursion and extra loop in hipGraphLaunch ( #1792 )
2025-11-27 10:25:08 +00:00
Shadi Dashmiz
962b99f925
SWDEV-567514: Remove default stream wait ( #1977 )
...
- when virtual map command is called
- can create deadlock
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com >
2025-11-26 15:11:52 -05:00
AidanBeltonS
d849b88aef
SWDEV-558080 - Add recommended granularity ( #1176 )
...
* Add recommended granularity
* Improve granularity testing
* Update based on feedback
2025-11-26 16:10:58 +00:00
Matt Arsenault
f089217e6a
SWDEV-548892 - Stop using ockl steadyctr function ( #1882 )
...
Directly use the builtin
2025-11-26 09:29:06 -05:00
Todd tiantuo Li
ee48f6221d
SWDEV-562708 - change default maximum SVM size to 256GB ( #1731 )
2025-11-25 23:59:39 -08:00
Matt Arsenault
9fbb062505
SWDEV-548892 - Stop using ocml isinf wrapper ( #1854 )
2025-11-25 22:21:37 -05:00
Karthik Jayaprakash
740a06d567
SWDEV-559267 - Use CLPrint to DevLogPrintf with Log Level - detail debug. ( #1160 )
2025-11-25 19:25:32 -05:00
German Andryeyev
93682f2f75
SWDEV-567852 - Clean-up hip::init() ( #1948 )
2025-11-25 19:05:41 -05:00
cadolphe-amd
cce94f6ee0
SWDEV-557412 - Incorporate proper chunk offset when remapping virtual memory ( #1848 )
...
* SWDEV-557412 - Incorporate proper offset when remapping virtual memory
* Fix condition to check if VMHeap allocation address matches a chunk address
* Move offset calculation outside if/else block
---------
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com >
2025-11-25 18:05:25 -05:00
Victor Zhang
ede71ca3b0
SWDEV-567829 - populateFormatStringHashMap: relax printf hash collisi… ( #1944 )
...
* SWDEV-567829 - populateFormatStringHashMap: relax printf hash collision check for duplicate format strings
* Function optimized by AI
2025-11-25 17:19:27 -05:00
German Andryeyev
2c5754844f
SWDEV-465041 - Enable direct dispatch under Linux by default. ( #1934 )
2025-11-25 11:30:32 -05:00
Pengda Xie
6c31785eaf
SWDEV-562761 - Cleanup static fatbin on runtime teardown ( #1873 )
2025-11-24 21:57:46 -08:00
Marius Brehler
2dc32d645b
Explicitly load versioned libamdhip64.so ( #1872 )
...
* Explicitly load versioned libamdhip64.so
* Fix syntax errors
* Fix when patching happens in Windows workflow
---------
Co-authored-by: Joseph Macaranas <145489236+jayhawk-commits@users.noreply.github.com >
Co-authored-by: ammallya <ameyakeshava.mallya@amd.com >
2025-11-24 10:05:05 -08:00
sluzynsk-amd
2cf9faa93f
SWDEV-563777 - fix warnings related to inconsistent overrides ( #1625 )
...
This patch adds missing override keywords. Fixes this class of warnings.
Signed-off-by: Sebastian Luzynski <Sebastian.Luzynski@amd.com >
2025-11-24 18:50:07 +01:00
AidanBeltonS
0580e2053c
SWDEV-533546, SWDEV-540027 - Add e8m0 conversions and testing ( #987 )
...
* SWDEV-533546 - Add conversion functions for e8m0
* SWDEV-533546 - remove whitespace
* Add testing
* Update based on feedback
* Copilot suggestions
---------
Co-authored-by: systems-assistant[bot] <systems-assistant[bot]@users.noreply.github.com>
2025-11-24 09:14:03 +00:00
Ioannis Assiouras
36029ea1a8
SWDEV-559166 - Fix race condition in getDemangledName ( #1868 )
2025-11-23 08:45:45 +00:00