Commit graph

12806 commits

Author SHA1 Message Date
Xie, Jiabao(Jimbo) e1d2194b75 SWDEV-528913 - support gfx950 in rocsetting (#217)
* SWDEV-528913 - support gfx950 in rocsetting

---------

Co-authored-by: Jimbo Xie <jiabaxie@amd.com>

[ROCm/clr commit: a320a3f214]
2025-05-07 15:44:49 -04:00
Lambert, Jacob dc1c1e3199 SWDEV-518221 - Don't link against libamd_comgr.so at runtime
Convention is to always link against .so.* at runtime.
Having it link against .so will break on systems that package
the .so files in their dev/devel package.

This issue was found when building ROCm 6.4 for Fedora.

Committing on behalf of GitHub user Mystro256

[ROCm/clr commit: 6b12154583]
2025-05-07 11:56:41 -07:00
Zhang, Victor fbabd2b69d SWDEV-528142 - add error check for KernelParameters::capture (#276)
* SWDEV-528142 - add error check for KernelParameters::capture

* Update kernel.cpp

---------

Co-authored-by: victzhan <victzhan@amd.com>

[ROCm/clr commit: f960433dcd]
2025-05-07 09:52:09 -04:00
Jayaprakash, Karthik cde2a250ec SWDEV-493805 - Cleaning up launch parameters arguments. (#241)
[ROCm/clr commit: fa55557f46]
2025-05-06 15:06:13 -04:00
Dittakavi, Satyanvesh 086a1c289a SWDEV-529831 - Return error if the program is empty (#257)
[ROCm/clr commit: 607f8f26fd]
2025-05-06 15:12:12 +05:30
Chaudhary, Jatin Jaikishan b5f67d4804 SWDEV-529854 - __hmax/__hmin should handle NaNs (#246)
[ROCm/clr commit: a71c6eb1a0]
2025-05-06 09:42:15 +01:00
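The commit title for b5f67d4804 does not say which NaN policy `__hmax`/`__hmin` follow. The sketch below is a host-side Python model, not the HIP header code, and it assumes the `fmax`/`fmin` convention, where a single NaN operand is ignored and NaN is returned only when both operands are NaN. The names `hmax_sketch` and `hmin_sketch` are hypothetical.

```python
import math


def hmax_sketch(a: float, b: float) -> float:
    """Maximum that ignores a single NaN operand (assumed fmax-style policy)."""
    if math.isnan(a):
        return b  # if b is also NaN, the result is NaN
    if math.isnan(b):
        return a
    return a if a > b else b


def hmin_sketch(a: float, b: float) -> float:
    """Minimum that ignores a single NaN operand (assumed fmin-style policy)."""
    if math.isnan(a):
        return b  # if b is also NaN, the result is NaN
    if math.isnan(b):
        return a
    return a if a < b else b
```

The explicit NaN checks matter because a purely comparison-based maximum gives an order-dependent answer when one operand is NaN: every comparison against NaN is false.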
Chaudhary, Jatin Jaikishan a12739ecd9 SWDEV-529927 - add missing operations for fp16/bf16 (#238)
[ROCm/clr commit: b1ebf33850]
2025-05-06 09:41:21 +01:00
Andryeyev, German 3ea758a2d4 SWDEV-528808 - Release all HW queues even if only one is idle (#240)
PyTorch may not explicitly idle each queue, so some queues can be
considered busy while they are actually idle.

[ROCm/clr commit: 65a0181a7c]
2025-05-05 19:09:01 -04:00
Guan, Zichuan ee91a1e94a Disable HIP_PLATFORM auto-detect if already defined (#254)
Co-authored-by: Stella Laurenzo <stellaraccident@gmail.com>

[ROCm/clr commit: 3775298655]
2025-05-05 15:37:53 -04:00
Arsenault, Matthew 13d8f9adae SWDEV-1 - Stop using ocml rounding functions (#228)
Directly use the builtins. Use the elementwise versions since there's
no implied errno, regardless of -f[no]-math-errno.

I didn't change the cases that cast unnecessarily. The bfloat and vector
cases should work directly.

[ROCm/clr commit: 1db9a7d48b]
2025-05-05 19:35:12 +02:00
Andryeyev, German c512258e45 SWDEV-528808 - Disable dynamic queue by default (#256)
Dynamic queue management will be disabled by default and
the original sort logic is restored

[ROCm/clr commit: 9b018165ce]
2025-05-05 10:56:35 -04:00
Searles, Mark e480220c6a Fix typos in warning msgs (#231)
[ROCm/clr commit: cd9bc61559]
2025-05-02 14:31:42 -07:00
Chaudhary, Jatin Jaikishan f619372ae6 SWDEV-514560 - add fp6 header implementation (#54)
Co-authored-by: rahul manocha <rmanocha_amdeng>

[ROCm/clr commit: 12febe6782]
2025-05-01 15:17:38 +01:00
Assiouras, Ioannis 3d4ff304d7 SWDEV-521011 - Fix alignment in PalResource::CreateSvm
[ROCm/clr commit: 9d6a0d1a4d]
2025-05-01 02:22:49 +01:00
Andryeyev, German 13c7977d50 SWDEV-526836 - add PipelineStageBlt flag (#229)
CP sync requires the PipelineStageBlt flag.

[ROCm/clr commit: 84a4f293f4]
2025-04-30 14:27:41 -04:00
Assiouras, Ioannis 4efd624960 SWDEV-525593, SWDEV-527293 - Acquire active queue after xferQueue is created (#165)
For xferQueue VirtualGPU::create is called after ProfilingBegin
so the active queue needs to be acquired.

[ROCm/clr commit: d3fb8eda8b]
2025-04-30 09:21:11 +01:00
Godavarthy Surya, Anusha e4a499f22e SWDEV-522841 - Graph nodes must be created/launched on device where they are captured/created (#108)
[ROCm/clr commit: 2538d7f02b]
2025-04-29 22:20:39 +05:30
Jiang, Julia 6ab34e0924 SWDEV-522634 - Fix device properties in hipInfo (#203)
[ROCm/clr commit: eb62fe9f62]
2025-04-29 11:29:47 -04:00
Sang, Tao 68deb3d10a SWDEV-520352 - Remove HostThread and legacy monitor (#230)
* SWDEV-520352 - Remove HostThread and legacy monitor

Remove HostThread, semaphore and legacy monitor.
Make the original thread and command queue logic stricter.
Add more comments to make the logic clearer.
Some other minor improvements.

Also part of SWDEV-458943.

[ROCm/clr commit: 96cadbc9e9]
2025-04-29 09:55:24 -04:00
GunaShekar, Ajay c4567a9188 SWDEV-523028 - print PAL failure return values in logs (#81)
* print PAL failure return values in logs
* dump kernel info in case of PAL failure

[ROCm/clr commit: 99ef573399]
2025-04-29 11:23:43 +05:30
Jayaprakash, Karthik 0071d33754 SWDEV-522707 - Set phys_mem_handle type to sizeof(size_t) to avoid blocking address range. (#105)
[ROCm/clr commit: 6811fd90b8]
2025-04-29 11:19:16 +05:30
Jayaprakash, Karthik 49a527c826 SWDEV-506467 - Skip Abort in case of crash from the device. (#60)
Change-Id: I964b2f2647d068202e9c38fcddb1337da754df8d

[ROCm/clr commit: b2388dfb88]
2025-04-29 11:19:02 +05:30
Betigeri, Sourabh ae0640131e SWDEV-528351 - Removes unused code and asserts to improve coverage (#219)
[ROCm/clr commit: 9cf3f1e461]
2025-04-28 14:40:35 -07:00
Critchley, Paul 4d3978e094 SWDEV-523611 - [Tools][OCL] OpenCL fails to capture with PalTrace (#198)
Finalize DevDriver initialization after device creation

[ROCm/clr commit: 7e9d5eab7c]
2025-04-28 08:02:34 -07:00
Godavarthy Surya, Anusha ff69bcc903 SWDEV-469422 - Avoid using of hipStream_t in internal methods (#69)
Change-Id: Ifd5362f371c846a88241927383cb95cf046548ef

[ROCm/clr commit: fb92683d86]
2025-04-28 15:09:11 +05:30
Godavarthy Surya, Anusha 0eb2e5e8f2 SWDEV-469422 - hipGraphNodeDOTAttribute change std::string members to const char* (#70)
Compiler creates global variables for every unique string

Change-Id: I4cf8dd3e763d16740096e345da67a7ef72f61515

[ROCm/clr commit: bbcb1f9c70]
2025-04-28 14:57:36 +05:30
Assiouras, Ioannis 875468bbfb SWDEV-526188 - Fix race condition in StatCO::getStatFunc()
Make sure that a newly created FatBinaryInfo is assigned to modules only after extractFatBinary has been called for the object.

[ROCm/clr commit: 1099e0a131]
2025-04-27 21:14:01 +01:00
Kudchadker, Saleel cd14def193 SWDEV-521647 - Fix tracking of hw_event (#206)
- When a command may have two packets (such as the device heap
  initializer) and there is no signal on the main kernel packet, the
  tracking was broken because it marked the HW event of the command as
  the first packet's signal.
- If no completion signal is attached to the second packet, clear the
  HW event for the command.

[ROCm/clr commit: 072fb0804e]
2025-04-25 08:46:44 -07:00
Kudchadker, Saleel 1b1d6b841e SWDEV-510186 - Improve logging (#220)
- Print all arguments in logs; this is useful for debugging

[ROCm/clr commit: ce24936970]
2025-04-25 08:40:31 -07:00
Li, Todd tiantuo 8706df3726 SWDEV-511055 - fix HIP PAL memory allocation workaround for APU (#40)
[ROCm/clr commit: 95cdc83eaf]
2025-04-24 15:07:16 -07:00
Sang, Tao a9068182f4 SWDEV-493275 - Support scratch limit (#20)
Support programmatic query and change of scratch limit on
AMD devices.

Change-Id: Id5da355a77366f97868e462847f3916e87fd2af6

[ROCm/clr commit: 1113eff3f9]
2025-04-24 17:15:25 -04:00
Critchley, Paul 4fd8a0164a SWDEV-527731 - [Ubertrace] OpenCL driver reports wrong Instrumentation API Version (#211)
[ROCm/clr commit: 4f2a4b12a9]
2025-04-24 14:06:17 -07:00
Godavarthy Surya, Anusha d4e36d0900 SWDEV-469423 - hipStreamEndCapture graph* can be nullptr (#170)
[ROCm/clr commit: e5ce544c45]
2025-04-24 13:57:09 +05:30
Hila, Nino 9d133c383f Add palamida.yml (#215)
[ROCm/clr commit: 38d48c9a7d]
2025-04-23 13:15:09 -07:00
Sang, Tao 60110b6c01 SWDEV-518831 - fix streams' sync issue in mthreads (#123)
* SWDEV-518831 - fix streams' sync issue in mthreads

1. Fix the sync issue between the null stream and non-null streams in
multithreaded code.
2. Remove assert(GetSubmissionBatch() == nullptr) as it
is invalid in multithreaded code.
3. Update getActiveQueues() to deal with the state of
being terminated.

[ROCm/clr commit: 27aad09bd4]
2025-04-23 15:08:07 -04:00
Sang, Tao 45b75013ec SWDEV-516050 - Fix monitor hang in OCL (#75)
Fix monitor hang in CTS integer_ops.
Improve notify().
Won't affect notifyAll() or HIP in direct
dispatch mode.

Change-Id: I95a458358e1cab9c76aefde117db09cdbd1fd3af

[ROCm/clr commit: 78f92901d8]
2025-04-23 14:34:53 -04:00
Xie, Jiabao(Jimbo) c7737558a4 SWDEV-441487 - add gfx1150/1 support to amd-staging clr (#182)
Co-authored-by: Jimbo Xie <jiabaxie@amd.com>

[ROCm/clr commit: 9a8c9e70b2]
2025-04-23 20:43:03 +05:30
GunaShekar, Ajay 34d3f7022b SWDEV-523281 - CHANGELOG.md and negative test return values : hipLaunchKernelEx, hipLaunchKernelExC, hipDrvLaunchKernelEx (#155)
[ROCm/clr commit: 64d6f5714a]
2025-04-22 21:47:37 +05:30
Andryeyev, German f8344154a0 SWDEV-497841 - Enable memory manager by default (#149)
[ROCm/clr commit: a5c860f3b0]
2025-04-22 21:20:37 +05:30
Andryeyev, German 0569c7713c SWDEV-523300 - Add the new option to build HIP (#179)
Add a new CMake option, AMD_COMPUTE_WIN, to build HIP on Windows
from the public GitHub repository. AMD_COMPUTE_WIN should point to a
special repo with the PAL static libs

[ROCm/clr commit: a3effa16f1]
2025-04-22 21:05:04 +05:30
Hernandez, Gerardo ba5a9a5395 SWDEV-420237 - Fix reduce sync operations when masks are divergent (#181)
Do not use __ockl_activelane_u32() to calculate the index of the lane within the mask, as that would not work with divergent masks that have other bits on before the associated lane.

[ROCm/clr commit: 1a8d766836]
2025-04-22 19:47:58 +05:30
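The lane-index issue described in ba5a9a5395 can be modeled on the host. The sketch below is a Python illustration, not the device code from the commit: it computes a lane's 0-based rank among the set bits of an explicit mask by counting the mask bits below that lane. The function name `lane_index_in_mask` and the example masks are hypothetical, and whether the actual fix uses this exact formulation is an assumption.

```python
def lane_index_in_mask(mask: int, lane: int) -> int:
    """Return the 0-based rank of `lane` among the set bits of `mask`."""
    if not (mask >> lane) & 1:
        raise ValueError("lane is not a member of mask")
    below = mask & ((1 << lane) - 1)  # mask bits strictly below this lane
    return bin(below).count("1")      # population count of those bits


# The caller-supplied mask names lanes 2, 4, 5 and 7.
sync_mask = 0b10110100
# Hypothetical divergent state: only lanes 5 and 7 are currently executing.
active_mask = 0b10100000

rank_in_sync_mask = lane_index_in_mask(sync_mask, 5)
rank_among_active = lane_index_in_mask(active_mask, 5)
```

Here lane 5 has rank 2 within the supplied mask but rank 0 among the currently executing lanes, which is the kind of mismatch the commit message attributes to masks that "have other bits on before the associated lane".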
Godavarthy Surya, Anusha 41c4bea0f5 SWDEV-508538 - Optimize mem access and pack structure (#71)
Change-Id: Ib05b8891a6d228fc3266918a000d332fddc7438b

[ROCm/clr commit: bf28bbd9ab]
2025-04-21 13:43:25 +05:30
Brzak, Branislav 5df4734f71 SWDEV-526612 - Add missing copyright notices (#201)
[ROCm/clr commit: 99142c3dd9]
2025-04-18 20:54:27 +05:30
Ramirez, Lucas 0a45aa85c5 SWDEV-524612 - Consider "1" a truthy value for WGPMode (#187)
The compiler currently serializes the workgroup_processor_mode COMGR metadata boolean field as "0"/"1" instead of "false"/"true". Consider "1" a truthy value during parsing.

[ROCm/clr commit: d020598a0f]
2025-04-17 11:50:07 +02:00
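The parsing rule described in d020598a0f can be sketched as follows. This is a minimal Python model rather than the runtime's own code; the function name `parse_wgp_mode` is hypothetical, and treating every other string as false is an assumption, since the commit message only states which values count as true.

```python
def parse_wgp_mode(value: str) -> bool:
    """Interpret the serialized workgroup_processor_mode metadata field.

    Accepts both the "false"/"true" spelling and the "0"/"1" spelling
    mentioned in the commit message.
    """
    # Assumption: any value other than "true" or "1" is treated as false.
    return value in ("true", "1")
```

Accepting both spellings keeps the parser compatible with metadata produced before and after the serialization change.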
Brzak, Branislav 6ae4c7278e SWDEV-525423 - In COMGR Loader don't open file if image is already mapped (#193)
[ROCm/clr commit: d00b2a0953]
2025-04-16 11:00:54 +02:00
Arandjelovic, Marko 963de868ae SWDEV-523137 - function ptrs should match across all devices (#171)
[ROCm/clr commit: 5fe080fd67]
2025-04-16 10:35:48 +02:00
Andryeyev, German a9df586812 SWDEV-459758 - Pass workgroup size explicitly (#185)
It's easier for the compiler to move explicit kernel arguments into user SGPRs

[ROCm/clr commit: 3fd7650fe3]
2025-04-15 15:22:15 -04:00
Chaudhary, Jatin Jaikishan 7257b705ce SWDEV-512924 - add fp4 API (#52)
* Remove C-style include guard

* Clean up issues in the PR

[ROCm/clr commit: 5d638d831c]
2025-04-15 17:53:50 +01:00
Xie, Pengda 9c1d61fbb0 SWDEV-518317 - Remove Redundant Error Message in removeFatBinary (#164)
[ROCm/clr commit: e92ea151b2]
2025-04-15 09:00:39 -07:00
Andryeyev, German db3c5f87ea SWDEV-526836 - Switch PAL backend to CmdReleaseThenAcquire() (#175)
[ROCm/clr commit: f6c804edc0]
2025-04-15 11:49:53 -04:00