13250 commits

Author SHA1 Message Date
Matt Arsenault 0c0d8dc974 SWDEV-548892 - Stop using __ockl_lane_id (#2186)
__lane_id already exists and is identical.
2025-12-19 20:34:55 +01:00
Sourabh U Betigeri 883fdfb820 Revert "clr: Minor fixes for error return" (#2399)
- This reverts commit 8dd8436e43c7f0d062fd73252bf61c35615d181d.
- Resolve MIOpen test failures observed in TheRock
- TheRock Issue: ROCm/TheRock#2642
- rocm-systems issue: #2400
2025-12-18 18:40:13 -05:00
Jatin Chaudhary fdf73116d5 Do not allocate code objects when we map a static code object (#2332) 2025-12-18 09:22:02 +00:00
Maneesh Gupta 4a9833e70e Revert "Add HasExpertSchedMode device prop (#2241)" (#2371)
This reverts commit c0b4aef5ad.
2025-12-17 21:26:44 -08:00
Shadi Dashmiz 96f6b6e251 SWDEV-571304 : Fix the constructor for __half (#2240)
- comply with CUDA

- Fix use case for constexpr

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
2025-12-17 11:15:20 -05:00
Filip Jankovic c0b4aef5ad Add HasExpertSchedMode device prop (#2241)
* Add HasExpertSchedMode device prop

* Add unit tests for HasExpertSchedMode

* Add gfx12 check for HasExpertSchedMode prop

* Update gfx major version check and test for ExpertSchedMode

* Minor fix and ROCr version bump

* Update projects/rocr-runtime/runtime/hsa-runtime/inc/hsa_ext_amd.h

* Update projects/rocr-runtime/runtime/hsa-runtime/inc/hsa_ext_amd.h

* Apply suggestion from @dayatsin-amd

* Apply suggestion from @dayatsin-amd

---------

Co-authored-by: Stefan Sokolovic <stefan.sokolovic2@amd.com>
Co-authored-by: David Yat Sin <77975354+dayatsin-amd@users.noreply.github.com>
2025-12-17 17:06:08 +01:00
randyh62 1240b592a5 Git url fix (#2285)
* Update README-doc.md

Correct GitHub URL for components moved into rocm-systems

* Update amd_clr.rst

Update github.com URLs

* Update Dockerfile

Update rocm-systems paths

* Update CONTRIBUTING.md

update for rocm-systems

* Update CONTRIBUTING.md

minor change

* Update CONTRIBUTING.md

* Update CONTRIBUTING.md

* Update hip_runtime_api.rst

Update for rocm-systems

* Update installation.rst

update URL to libhsakmt

* Update what_is_hip.rst

* Update projects/clr/CONTRIBUTING.md

Co-authored-by: Dominic Widdows <dwiddows@gmail.com>

* Update projects/clr/README-doc.md

Co-authored-by: Dominic Widdows <dwiddows@gmail.com>

* Update Dockerfile

Update git clone for sparse checkout

* Update projects/hip/CONTRIBUTING.md

* Update projects/clr/CONTRIBUTING.md

* Update projects/hipother/CONTRIBUTING.md

---------

Co-authored-by: Dominic Widdows <dwiddows@gmail.com>
2025-12-15 11:57:18 -08:00
systems-assistant[bot] b002c6a739 SWDEV-538607 - Add SIMDe as a build dependency, remove naked intrinsic use. (#500)
Co-authored-by: Alex Voicu <alexandru.voicu@amd.com>
Co-authored-by: Ioannis Assiouras <Ioannis.Assiouras@amd.com>
2025-12-15 17:40:51 +00:00
Matt Arsenault 49565f9d9f SWDEV-548892 - Always declare used ocml and ockl device libs functions (#2230)
Ignore __CLANG_HIP_RUNTIME_WRAPPER_INCLUDED__. This should not be relying
on declarations from the clang builtin headers. There is no issue declaring
the same intrinsics multiple times. This will enable removal of declarations
from the clang builtin headers.
2025-12-15 17:23:33 +01:00
Fábio Mestre 447beeb00b Replace usages of __ockl_gws_init with __builtin_amdgcn_ds_gws_init (#2235) 2025-12-15 16:56:14 +01:00
Dominic Widdows 9a8ed9f45d Doc updates updating internal links from deprecated repos to rocm-systems project locations (#2294)
* Update README documentation links for clarity and consistency across projects

- Changed links in the README files for `clr`, `hipother`, and `hip-tests` to use relative paths instead of absolute URLs, improving navigation within the repository.

* Update CONTRIBUTING documentation to use relative links for improved navigation

- Changed absolute URLs to relative paths in the CONTRIBUTING.md files for the hip and hipother projects, enhancing consistency and ease of access within the repository.
2025-12-12 13:21:42 -08:00
SaleelK 840301e12d clr: Minor fixes for error return (#2153) 2025-12-11 16:59:56 -08:00
SaleelK 10635483ad clr: Fix packet batch write logic (#2236)
* When writing bulk packets, always invalidate the packet headers first.
  The CP fetcher can have multiple packets in flight, so without this
  we may end up with a malformed packet: the CP finds a valid header
  before the writes for that packet are complete.
2025-12-11 04:26:41 -08:00
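The write ordering described in the commit above can be sketched as follows. This is only an illustration under stated assumptions, not the CLR code: `Packet`, `kInvalidHeader`, `kValidHeader`, and `write_batch` are hypothetical names, and a `std::atomic` header word stands in for whatever header the real packet format uses.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packet: a header word the consumer polls, plus a payload.
struct Packet {
  std::atomic<uint16_t> header{0};
  uint32_t payload = 0;
};

constexpr uint16_t kInvalidHeader = 1;  // assumed "do not consume yet" marker
constexpr uint16_t kValidHeader = 2;    // assumed "ready to consume" marker

// Writes `payloads` into consecutive ring slots starting at `first`.
void write_batch(std::vector<Packet>& ring, std::size_t first,
                 const std::vector<uint32_t>& payloads) {
  const std::size_t n = payloads.size();
  assert(n <= ring.size());  // the batch must fit in the ring

  // Step 1: invalidate every header in the batch, so a consumer that is
  // already looking ahead cannot mistake a stale header for a ready packet.
  for (std::size_t i = 0; i < n; ++i) {
    ring[(first + i) % ring.size()].header.store(kInvalidHeader,
                                                 std::memory_order_relaxed);
  }
  // Step 2: write the packet bodies.
  for (std::size_t i = 0; i < n; ++i) {
    ring[(first + i) % ring.size()].payload = payloads[i];
  }
  // Step 3: publish the headers last, with release ordering, so the body
  // writes are visible before a header reads as valid.
  for (std::size_t i = 0; i < n; ++i) {
    ring[(first + i) % ring.size()].header.store(kValidHeader,
                                                 std::memory_order_release);
  }
}
```

The point of the sketch is the order of the three loops: no header in the batch becomes valid until every body in the batch has been written.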
Matt Arsenault a495d1137e SWDEV-548892 - Make declaration of __ockl_fdot2 always available (#2229) 2025-12-11 11:53:11 +01:00
German Andryeyev 3895aadba6 SWDEV-558849 - Make ROCR path in Windows more stable (#2181) 2025-12-10 12:37:10 -05:00
Pengda Xie 1d6b26f829 SWDEV-556684 - HSAIL Cleanup re-apply commit 4abdfe5: (#2024)
Removed some options

-xnack, -force-wgp-mode, -force-wave-size-32, -round-trip-spirv,
-fe-gen-spirv, -lower-pipe-builtins=0|1, -lower-atomics=0|1,
-set-lds=<value>, -set-scalar-registers=<value>,
-set-vector-registers=<value>, -limit-scalar-registers=<value>,
-limit-vector-registers=<value>, -sc-xnack-iommu,
-faa-for-barrier/-fno-a-for-barrier, -sc-dev-format, -verify-lwspir,
-verify-hwspir, -ffma-enable/-fno-fma-enable,
-fmad-enable/-fno-mad-enable, -fdisable-avx/-fno-disable-avx,
-fforce-llvm/-fno-force-llvm, -print-compile-phases,
-kernel-cache-enforce-miss, -kernel-cache-wipe, -kernel-cache,
-sc[=<filename>]/--load-sc-dll[=<filename>],
-be[=<filename>]/--load-be-dll[=<filename>],
-cg[=<filename>]/--load-cg-dll[=<filename>],
-link[=<filename>]/--load-link-dll[=<filename>],
-opt[=<filename>]/--load-opt-dll[=<filename>],
-fe[=<filename>]/--load-fe-dll[=<filename>],
-cl[=<filename>]/--load-cl-dll[=<filename>], -just-kernel=<kernel-name>,
-use-debugil, -fmulti-level-call/-fno-multi-level-call,
-fdebug-call/-fno-debug-call, -fmacro-call/-fno-macro-call,
-fstack-uav/-fno-stack-uav, -fdef-res-id/-fno-def-res-id,
-wokth=int/--waves-opt-kernel-threshold,
-ilkth=int/--inline-kernel-size-threshold,
-ilsth=int/--inline-size-threshold, -ilcth=int/--inline-cost-threshold,
-scopt=int/--sc-opt-level, -flib-no-inline/-fno-lib-no-inline,
-fuser-no-inline/-fno-user-no-inline,
-scras=int/--sc-si-opt-reg-alloc-strategy, -fsc-post-ra-sched,
-fsc-live-sched/-fno-sc-live-sched, -fsc-use-buffer-for-hsa-global,
-fsc-schedule-no-reorder, -fsc-min-reg-schedule,
-fsc-bias-schedule-to-minimize-insts,
-fsc-bias-schedule-to-minimize-regs, -fsc-disable-merge-memory,
-fsc-disable-loop-unroll, -fsc-use-mubuf/-fno-sc-use-mubuf,
-fsc-selective-inline/-fno-sc-selective-inline,
-fsc-keep-calls/-fno-sc-keep-calls, -slc=0|1/--simplifylibcall,
-stack-alignment=<n>, -fdiv2fmul=0|1, -prt-opt-liveness=0|1,
-liveness=0|1, -SRAE-threshold=<value>, -memcombine-max-vec-gen=<value>,
-small-global-objects, -fast-fmaf, -fast-fma, -bfo=0|1, -ebb=0|1, -aa,
-mem2reg=0|1, -licm=0|1, -unroll-allow-partial,
-unroll-threshold=<positive integer>, -unroll-count=<positive integer>,
-apt/--ap-threshold=<positive integer>, -srt/--sr-threshold=<positive
integer>, -fdebug-linker/-fno-debug-linker, -fbin-gpu64/-fno-bin-gpu64,
-fbin-disasm/-fno-bin-disasm, -fbin-bif30, -fbin-hsail/-fno-bin-hsail,
-fbin-amdil/-fno-bin-amdil, -fbin-spir/-fno-bin-spir, -fonly-bin-source,
-fper-pointer-uav/-fno-per-pointer-uav

Co-authored-by: Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>
2025-12-10 09:09:12 -08:00
Dominic Widdows 75bea883e1 Remove redirect notice from redirect target (#2104)
README is copied from https://github.com/ROCm/clr which redirects to target
https://github.com/ROCm/rocm-systems/tree/develop/projects/clr

This is correct for https://github.com/ROCm/clr I think, but unnecessary for https://github.com/ROCm/rocm-systems/edit/develop/projects/clr/README.md which is already the correct redirect target.
2025-12-09 10:47:51 -08:00
Victor Zhang aaecffa50b SWDEV-568847 - prevent UAF when registering callbacks on completed events (#2066)
* SWDEV-568847 - prevent UAF when registering callbacks on completed events

* cache the status() of event earlier

* Update command.cpp

* revert cl_event.cpp

* Update cl_event.cpp

---------

Co-authored-by: cadolphe-amd <chris.adolphe@amd.com>
2025-12-09 11:38:45 -05:00
Jatin Chaudhary eea93d58a2 SWDEV-554626 - return correct error code (#1107)
* SWDEV-554626 - return hipErrorInvalidDeviceFunction when we can not load module
Return correct error code when modules are empty

* Match the error codes

* Revert the error code
2025-12-09 16:10:25 +01:00
SaleelK acc236fd89 clr: Avoid saving all ProfilingSignals at once (#2108)
* While reusing signals, it's possible to come across a timestamp that
  contains several signals, for example when profiling a graph. Reading
  timestamps from all of those signals can make the call severely CPU
  bound. Instead, cache only the signal in question to avoid the
  overhead on the critical path.
2025-12-08 11:32:16 -08:00
Jin Jung deaf8ab38a SWDEV-567119 - Windows GL Interop Support (#1892) 2025-12-08 11:03:59 -05:00
Shadi Dashmiz 4812d8e78b SWDEV-566783 - clean up cmgr helper (#1864)
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
2025-12-08 10:37:03 -05:00
Ioannis Assiouras 3faf36fb25 Fix Unit_hipStreamBeginCaptureToGraph_CapturePartialInThreads (#2072)
https://mlsejenkinsvm.amd.com/job/rocm-systems/job/hip/view/change-requests/job/PR-2072/6/
The last windowsCI has passed successfully
2025-12-08 13:30:23 +01:00
Lancelot Six 659737c824 clr: Bump _amdgpu_r_debug.r_version to 11 (#2063) 2025-12-05 16:01:08 -05:00
Rahul Manocha 9dd3c2fa70 SWDEV-563271 - return error when pal cmd submission fails (#1585) 2025-12-05 14:25:01 -05:00
Ajay GunaShekar d6f6435b88 SWDEV-526504 - Remove perl dependency in hip/clr build (#964)
* SWDEV-1 - Remove perl dependency in hip/clr build
* SWDEV-1 - use python3 in place of perl for formatting date and time
2025-12-05 08:42:15 -08:00
Julia Jiang 272f06506f SWDEV-549696 - Fix HIP catch sub-test failure for MipmappedArray (#1198)
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-12-05 11:00:06 -05:00
systems-assistant[bot] 06a3a5ca10 SWDEV-546110 - Fix encoding for certain types (#446) 2025-12-05 13:16:14 +00:00
harkgill-amd 8f622de972 Add gfx1152 support to PAL (#2077) 2025-12-03 10:39:22 -08:00
Matt Arsenault d75d0bc1c9 SWDEV-548892 - Stop using ocml exp and exp2 functions (#2032) 2025-12-02 13:39:09 -05:00
Ioannis Assiouras 65b769ee16 SWDEV-569101 - increase signal list size to at least DEBUG_HIP_GRAPH_BATCH_SIZE (#2084) 2025-12-01 18:52:51 -08:00
SaleelK c105dcd05b clr: Use graph segment scheduling to process HIP Graphs (#1372)
* clr: Use graph segment scheduling to process HIP Graphs

* Add a broader path to use packet capture for all topologies
* Refactor code
* Use DEBUG_HIP_GRAPH_SEGMENT_SCHEDULING to toggle new vs classic path;
  enabled by default

* clr: Few fixes and improvements

* clr: Detect complex graphs to take classic path

* Use DEBUG_HIP_GRAPH_SEGMENT_SCHEDULING=2 to force segment scheduling
  path

* clr: Fix a corner-case stack corruption

* clr: Track commands of segments instead of snapshots

* clr: Fix Batch dispatch logic

* Track fence_dirty_ flag for command of other streams
* Dependency resolution markers can now accommodate a dirty fence on
  cross streams

---------

Co-authored-by: Ioannis Assiouras <Ioannis.Assiouras@amd.com>
Co-authored-by: Godavarthy Surya, Anusha <agodavar@amd.com>
2025-12-01 12:49:26 -08:00
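The path selection described in the commit above could look roughly like the sketch below. Only the variable name `DEBUG_HIP_GRAPH_SEGMENT_SCHEDULING`, the enabled-by-default behaviour, the fallback for complex graphs, and the meaning of value 2 come from the commit message. The helper names, the treatment of 0 as "classic only", and reading the value as a plain integer are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>

enum class GraphPath { Classic, Segment };

// Parses the raw environment value. A missing variable means "enabled" (1),
// matching the "enabled by default" note in the commit message.
int parse_setting(const char* raw) {
  return raw == nullptr ? 1 : std::atoi(raw);
}

// Picks the scheduling path for one graph.
GraphPath choose_path(int setting, bool graph_is_complex) {
  if (setting >= 2) return GraphPath::Segment;  // 2 forces the new path
  if (setting <= 0) return GraphPath::Classic;  // assumption: 0 disables it
  // Default: use segment scheduling unless the graph was detected as complex.
  return graph_is_complex ? GraphPath::Classic : GraphPath::Segment;
}

// Small usage example with the values discussed above.
void example() {
  assert(choose_path(parse_setting(nullptr), false) == GraphPath::Segment);
  assert(choose_path(parse_setting(nullptr), true) == GraphPath::Classic);
  assert(choose_path(parse_setting("2"), true) == GraphPath::Segment);
}
```

In a real process the raw value might come from `std::getenv("DEBUG_HIP_GRAPH_SEGMENT_SCHEDULING")`; the runtime may well read its debug flags through its own mechanism instead.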
vstojilj 77f58ceb9f SWDEV-558557 - Remove duplicate nodes when capturing hipMemcpyAsync (#1226) 2025-12-01 11:25:13 +01:00
vstojilj 1c09c87cc7 SWDEV-564927 - Allow sizeBytes to be 0 when hipMemsetAsync is captured (#1849) 2025-11-27 17:13:33 +01:00
Godavarthy Surya, Anusha 2e1c37a926 SWDEV-490861 - Remove recursion and extra loop in hipGraphLaunch (#1792) 2025-11-27 10:25:08 +00:00
Shadi Dashmiz 962b99f925 SWDEV-567514: Remove default stream wait (#1977)
- when virtual map command is called

- can create deadlock

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
2025-11-26 15:11:52 -05:00
AidanBeltonS d849b88aef SWDEV-558080 - Add recommended granularity (#1176)
* Add recommended granularity

* Improve granularity testing

* Update based on feedback
2025-11-26 16:10:58 +00:00
Matt Arsenault f089217e6a SWDEV-548892 - Stop using ockl steadyctr function (#1882)
Directly use the builtin
2025-11-26 09:29:06 -05:00
Todd tiantuo Li ee48f6221d SWDEV-562708 - change default maximum SVM size to 256GB (#1731) 2025-11-25 23:59:39 -08:00
Matt Arsenault 9fbb062505 SWDEV-548892 - Stop using ocml isinf wrapper (#1854) 2025-11-25 22:21:37 -05:00
Karthik Jayaprakash 740a06d567 SWDEV-559267 - Use CLPrint to DevLogPrintf with Log Level - detail debug. (#1160) 2025-11-25 19:25:32 -05:00
German Andryeyev 93682f2f75 SWDEV-567852 - Clean-up hip::init() (#1948) 2025-11-25 19:05:41 -05:00
cadolphe-amd cce94f6ee0 SWDEV-557412 - Incorporate proper chunk offset when remapping virtual memory (#1848)
* SWDEV-557412 - Incorporate proper offset when remapping virtual memory

* Fix condition to check if VMHeap allocation address matches a chunk address

* Move offset calculation outside if/else block

---------

Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-11-25 18:05:25 -05:00
Victor Zhang ede71ca3b0 SWDEV-567829 - populateFormatStringHashMap: relax printf hash collisi… (#1944)
* SWDEV-567829 - populateFormatStringHashMap: relax printf hash collision check for duplicate format strings

* function optimized by AI
2025-11-25 17:19:27 -05:00
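One way to read "relax the hash collision check for duplicate format strings" is sketched below: an insert that hits an existing hash is treated as an error only when the stored string differs. This is an illustration, not the actual `populateFormatStringHashMap`; `add_format_string`, `InsertResult`, and the explicit `hash` parameter are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

enum class InsertResult { Inserted, Duplicate, Collision };

// Registers `fmt` under `hash`. The hash is passed in explicitly so that the
// sketch does not assume any particular hash function.
InsertResult add_format_string(std::unordered_map<uint64_t, std::string>& table,
                               uint64_t hash, const std::string& fmt) {
  auto result = table.emplace(hash, fmt);
  if (result.second) return InsertResult::Inserted;
  // Same hash seen before: identical text is a harmless duplicate,
  // different text is a genuine collision.
  return result.first->second == fmt ? InsertResult::Duplicate
                                     : InsertResult::Collision;
}

// Small usage example.
void example() {
  std::unordered_map<uint64_t, std::string> table;
  assert(add_format_string(table, 7, "x=%d\n") == InsertResult::Inserted);
  assert(add_format_string(table, 7, "x=%d\n") == InsertResult::Duplicate);
  assert(add_format_string(table, 7, "y=%f\n") == InsertResult::Collision);
}
```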
German Andryeyev 2c5754844f SWDEV-465041 - Enable direct dispatch under Linux by default. (#1934) 2025-11-25 11:30:32 -05:00
Pengda Xie 6c31785eaf SWDEV-562761 - Cleanup static fatbin on runtime teardown (#1873) 2025-11-24 21:57:46 -08:00
Marius Brehler 2dc32d645b Explicitly load versioned libamdhip64.so (#1872)
* Explicitly load versioned libamdhip64.so

* Fix syntax errors

* Fix when patching happens in Windows workflow

---------

Co-authored-by: Joseph Macaranas <145489236+jayhawk-commits@users.noreply.github.com>
Co-authored-by: ammallya <ameyakeshava.mallya@amd.com>
2025-11-24 10:05:05 -08:00
sluzynsk-amd 2cf9faa93f SWDEV-563777 - fix warnings related to inconsistent overrides (#1625)
This patch adds the missing `override` keywords, which fixes this class of warnings.

Signed-off-by: Sebastian Luzynski <Sebastian.Luzynski@amd.com>
2025-11-24 18:50:07 +01:00
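For readers unfamiliar with this class of warning, here is a minimal sketch of the kind of fix the commit above describes. The class names are hypothetical, and the assumption is that the warnings came from a check such as Clang's `-Winconsistent-missing-override`, which fires when a class marks some overriding functions `override` but not others.

```cpp
#include <cassert>
#include <string>

struct Device {
  virtual ~Device() = default;
  virtual std::string name() const { return "generic"; }
  virtual int queues() const { return 1; }
};

struct GpuDevice : Device {
  std::string name() const override { return "gpu"; }
  // Before a fix of this kind, this line might have read `int queues() const`
  // without `override`: still an override, but inconsistent with name() above.
  int queues() const override { return 4; }
};

// Calls through the base interface to show that dispatch is unchanged.
std::string describe(const Device& d) {
  return d.name() + ":" + std::to_string(d.queues());
}

// Small usage example.
void example() {
  GpuDevice gpu;
  assert(describe(gpu) == "gpu:4");
}
```

Adding `override` does not change behaviour. It lets the compiler reject a function that was meant to override a base-class virtual but no longer matches its signature.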
AidanBeltonS 0580e2053c SWDEV-533546, SWDEV-540027 - Add e8m0 conversions and testing (#987)
* SWDEV-533546 - Add conversion functions for e8m0

* SWDEV-533546 - remove whitespace

* Add testing

* Update based on feedback

* Copilot suggestions

---------

Co-authored-by: systems-assistant[bot] <systems-assistant[bot]@users.noreply.github.com>
2025-11-24 09:14:03 +00:00
Ioannis Assiouras 36029ea1a8 SWDEV-559166 - Fix race condition in getDemangledName (#1868) 2025-11-23 08:45:45 +00:00