Huang, AnZhong
b434fbe2bd
SWDEV-527299 - Support HIP_POINTER_ATTRIBUTE_CONTEXT ( #180 )
...
* SWDEV-527299 - Support HIP_POINTER_ATTRIBUTE_CONTEXT
As HIP enables UVA by default, it seems we can simply expose the context to support this feature.
2025-05-09 17:34:16 +08:00
Chaudhary, Jatin Jaikishan
2f73e1385b
SWDEV-525933 - add constexpr operators for fp16/bf16 ( #199 )
2025-05-09 09:53:58 +01:00
Jayaprakash, Karthik
fa55557f46
SWDEV-493805 - Cleaning up launch parameters arguments. ( #241 )
2025-05-06 15:06:13 -04:00
Dittakavi, Satyanvesh
607f8f26fd
SWDEV-529831 - Return error if the program is empty ( #257 )
2025-05-06 15:12:12 +05:30
Chaudhary, Jatin Jaikishan
a71c6eb1a0
SWDEV-529854 - __hmax/__hmin should handle NaNs ( #246 )
2025-05-06 09:42:15 +01:00
Chaudhary, Jatin Jaikishan
b1ebf33850
SWDEV-529927 - add missing operations for fp16/bf16 ( #238 )
2025-05-06 09:41:21 +01:00
Guan, Zichuan
3775298655
Disable HIP_PLATFORM auto-detect if already defined ( #254 )
...
Co-authored-by: Stella Laurenzo <stellaraccident@gmail.com>
2025-05-05 15:37:53 -04:00
Arsenault, Matthew
1db9a7d48b
SWDEV-1 - Stop using ocml rounding functions ( #228 )
...
Directly use the builtins. Use the elementwise versions since there's
no implied errno, regardless of -f[no]-math-errno.
I didn't change the cases that cast unnecessarily. The bfloat and vector
cases should work directly.
2025-05-05 19:35:12 +02:00
Searles, Mark
cd9bc61559
Fix typos in warning msgs ( #231 )
2025-05-02 14:31:42 -07:00
Chaudhary, Jatin Jaikishan
12febe6782
SWDEV-514560 - add fp6 header implementation ( #54 )
...
Co-authored-by: rahul manocha <rmanocha_amdeng>
2025-05-01 15:17:38 +01:00
Godavarthy Surya, Anusha
2538d7f02b
SWDEV-522841 - Graph nodes must be created/launched on device where they are captured/created ( #108 )
2025-04-29 22:20:39 +05:30
Jiang, Julia
eb62fe9f62
SWDEV-522634 - Fix device properties in hipInfo ( #203 )
2025-04-29 11:29:47 -04:00
Sang, Tao
96cadbc9e9
SWDEV-520352 - Remove HostThread and legacy monitor ( #230 )
...
* SWDEV-520352 - Remove HostThread and legacy monitor
Remove HostThread, semaphore and legacy monitor.
Make the original thread and command queue logic stricter.
Add more comments to make the logic clearer.
Other minor improvements.
Also part of SWDEV-458943.
2025-04-29 09:55:24 -04:00
Jayaprakash, Karthik
b2388dfb88
SWDEV-506467 - Skip Abort in case of crash from the device. ( #60 )
...
Change-Id: I964b2f2647d068202e9c38fcddb1337da754df8d
2025-04-29 11:19:02 +05:30
Betigeri, Sourabh
9cf3f1e461
SWDEV-528351 - Removes unused code and asserts to improve coverage ( #219 )
2025-04-28 14:40:35 -07:00
Godavarthy Surya, Anusha
fb92683d86
SWDEV-469422 - Avoid using hipStream_t in internal methods ( #69 )
...
Change-Id: Ifd5362f371c846a88241927383cb95cf046548ef
2025-04-28 15:09:11 +05:30
Godavarthy Surya, Anusha
bbcb1f9c70
SWDEV-469422 - hipGraphNodeDOTAttribute change std::string members to const char* ( #70 )
...
Compiler creates global variables for every unique string
Change-Id: I4cf8dd3e763d16740096e345da67a7ef72f61515
2025-04-28 14:57:36 +05:30
Assiouras, Ioannis
1099e0a131
SWDEV-526188 - Fix race condition in StatCO::getStatFunc()
...
Make sure that a newly created FatBinaryInfo is assigned to modules only after extractFatBinary has been called for the object.
2025-04-27 21:14:01 +01:00
Sang, Tao
1113eff3f9
SWDEV-493275 - Support scratch limit ( #20 )
...
Support programmatic query and change of scratch limit on
AMD devices.
Change-Id: Id5da355a77366f97868e462847f3916e87fd2af6
2025-04-24 17:15:25 -04:00
Godavarthy Surya, Anusha
e5ce544c45
SWDEV-469423 - hipStreamEndCapture graph* can be nullptr ( #170 )
2025-04-24 13:57:09 +05:30
Sang, Tao
27aad09bd4
SWDEV-518831 - fix streams' sync issue in mthreads ( #123 )
...
* SWDEV-518831 - fix streams' sync issue in mthreads
1. Fix the sync issue between the null stream and non-null streams
when multiple threads are used.
2. Remove assert(GetSubmissionBatch() == nullptr) as it
is invalid with multiple threads.
3. Update getActiveQueues() to deal with the state of
being terminated.
2025-04-23 15:08:07 -04:00
GunaShekar, Ajay
64d6f5714a
SWDEV-523281 - CHANGELOG.md and negative test return values : hipLaunchKernelEx, hipLaunchKernelExC, hipDrvLaunchKernelEx ( #155 )
2025-04-22 21:47:37 +05:30
Hernandez, Gerardo
1a8d766836
SWDEV-420237 - Fix reduce sync operations when masks are divergent ( #181 )
...
Do not use __ockl_activelane_u32() to calculate the index of the lane within the mask, as that would not work with divergent masks that have other bits on before the associated lane.
2025-04-22 19:47:58 +05:30
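The lane-index problem described in the entry above can be modelled on the host. The following is only a sketch, not the code from the commit: it assumes a 64-bit mask, and the helper names `count_bits`, `lane_rank_in_mask` and `active_lane_index` are hypothetical. It contrasts counting the mask bits below a lane with counting only the lanes that are currently executing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of set bits in a 64-bit value.
inline unsigned count_bits(std::uint64_t bits) {
  unsigned count = 0;
  for (; bits != 0; bits &= bits - 1) ++count;  // clear lowest set bit each step
  return count;
}

// Hypothetical helper (not a HIP API): position of `lane` among the set bits
// of `mask`, i.e. how many lanes named in the mask have a lower lane id.
inline unsigned lane_rank_in_mask(std::uint64_t mask, unsigned lane) {
  assert(lane < 64 && "lane id must fit in a 64-bit mask");
  assert(((mask >> lane) & 1u) && "lane must be a member of the mask");
  return count_bits(mask & ((std::uint64_t{1} << lane) - 1));
}

// Hypothetical model of an active-lane index: counts only the lanes below
// `lane` that are currently executing according to `exec`.
inline unsigned active_lane_index(std::uint64_t exec, unsigned lane) {
  assert(lane < 64 && "lane id must fit in a 64-bit mask");
  return count_bits(exec & ((std::uint64_t{1} << lane) - 1));
}
```

With a mask of 0xF and only lanes 1 and 3 currently executing (0xA), lane 3 has rank 3 within the mask but an active-lane index of 1, which is the kind of mismatch the commit message refers to.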
Godavarthy Surya, Anusha
bf28bbd9ab
SWDEV-508538 - Optimize mem access and pack structure ( #71 )
...
Change-Id: Ib05b8891a6d228fc3266918a000d332fddc7438b
2025-04-21 13:43:25 +05:30
Brzak, Branislav
99142c3dd9
SWDEV-526612 - Add missing copyright notices ( #201 )
2025-04-18 20:54:27 +05:30
Brzak, Branislav
d00b2a0953
SWDEV-525423 - In COMGR Loader don't open file if image is already mapped ( #193 )
2025-04-16 11:00:54 +02:00
Arandjelovic, Marko
5fe080fd67
SWDEV-523137 - function ptrs should match across all devices ( #171 )
2025-04-16 10:35:48 +02:00
Chaudhary, Jatin Jaikishan
5d638d831c
SWDEV-512924 - add fp4 API ( #52 )
...
* Remove C-style include guard
* clean up issues in the PR
2025-04-15 17:53:50 +01:00
Xie, Pengda
e92ea151b2
SWDEV-518317 - Remove Redundant Error Message in removeFatBinary ( #164 )
2025-04-15 09:00:39 -07:00
Chaudhary, Jatin Jaikishan
fcaefe97b8
SWDEV-509213 - make cmake_minimum_required consistent across clr ( #51 )
...
Change-Id: Ib0b1df7af8984a37d6bf7ca68ec99597d5978821
2025-04-15 15:23:41 +05:30
Chaudhary, Jatin Jaikishan
588cf0fc69
SWDEV-520627 - include warp functions header for warpSize ( #177 )
...
Change-Id: Id3fff8f2722d521071ef0ff71b09fc365ef6fa82
2025-04-15 14:40:27 +05:30
Chaudhary, Jatin Jaikishan
07e57a1f0d
SWDEV-517941 - use device bitcode before spirv ( #95 )
...
Also add the flag HIP_FORCE_SPIRV_CODEOBJECT to allow an override that
forces the use of SPIRV.
* use cache for already compiled code objects
* address review comments and use the two spirv isa names
2025-04-14 23:40:52 +01:00
Milanov, Aleksandar
c4fa3ef927
SWDEV-526208 - Fix miscalculation of coalesced tiled partition mask ( #162 )
2025-04-11 19:40:26 +02:00
Hernandez, Gerardo
66496258b4
SWDEV-521920 - Fix compilation issues introduced by the reduce sync operations - 2 ( #167 )
...
Fix PyTorch 2.5 issues by defining reduce sync operations for type __half in amd_hip_fp16.h and not in
amd_warp_sync_functions.h, which is problematic when __half has not been included before that header.
Only define types not supported by CUDA if HIP_ENABLE_EXTRA_WARP_SYNC_TYPES is defined, to avoid portability issues.
2025-04-11 17:00:59 +05:30
Haehnle, Nicolai
199b0f1086
Report null stream creation failure ( #152 )
...
Explicitly nulling the pointer causes us to report the error below
instead of keeping a dangling pointer around that will most likely lead
to a subsequent segfault.
2025-04-10 11:40:05 -07:00
Sang, Tao
6d10577761
SWDEV-521083 - Fix atomicMin/Max issues ( #151 )
...
Fix atomicMin/Max() and atomicMin/Max_system() issues on
float types.
2025-04-10 12:30:55 -04:00
Chaudhary, Jatin Jaikishan
628777b73d
SWDEV-461087 - fp4/fp6/fp8 ocp headers ( #41 )
...
This now has host conversions too, which is directly from Christopher's
work on fcbx.
Signed-off-by: Christopher M. Riedl
* add const to func parameter
* do not depend on builtins, use gfx950 detection
2025-04-10 17:22:15 +01:00
Xie, Jiabao(Jimbo)
0d6e554d92
SWDEV-524188 - Check for VRam and system RAM properly ( #122 )
...
Currently, we check whether there is enough system RAM even if we don't allocate on the host device. This logic is incorrect.
We should not check for this size on Windows because PAL checks for memory allocation. See SWDEV-467263.
Co-authored-by: Jimbo Xie <jiabaxie@amd.com>
2025-04-10 21:50:48 +05:30
Chaudhary, Jatin Jaikishan
5c030840d6
SWDEV-520627 - include sync_warp header instead of warp function header ( #18 )
...
Change-Id: Ic3f54b0f5bfee8565a8bbb6218fb0ccdb900c9ea
2025-04-10 21:50:25 +05:30
Patel, Jaydeepkumar
8531cd3bbe
SWDEV-508973 - If the total number of threads per block exceeds HW capacity, it is an invalid configuration and should return an invalid config error. ( #25 )
2025-04-10 15:16:16 +05:30
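The rule stated in the entry above can be sketched as a standalone check. This is an illustration only, not the runtime's implementation: `Dim3`, `LaunchStatus` and `validate_block_dim` are hypothetical names, and treating a zero-sized dimension as invalid is an assumption of this sketch rather than something the commit states.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration; not the HIP runtime's own types.
struct Dim3 {
  std::uint32_t x, y, z;
};

enum class LaunchStatus { Success, InvalidConfiguration };

// Returns InvalidConfiguration when x * y * z exceeds the per-block limit.
// The product is built up step by step in 64 bits and checked after each
// multiplication, so it cannot overflow before it is compared to the limit.
inline LaunchStatus validate_block_dim(Dim3 block,
                                       std::uint32_t max_threads_per_block) {
  // Assumption of this sketch: an empty block is also rejected.
  if (block.x == 0 || block.y == 0 || block.z == 0) {
    return LaunchStatus::InvalidConfiguration;
  }
  std::uint64_t total = block.x;
  if (total > max_threads_per_block) return LaunchStatus::InvalidConfiguration;
  total *= block.y;
  if (total > max_threads_per_block) return LaunchStatus::InvalidConfiguration;
  total *= block.z;
  if (total > max_threads_per_block) return LaunchStatus::InvalidConfiguration;
  return LaunchStatus::Success;
}
```

For a limit of 1024 threads per block, a 32 x 32 x 1 block passes this check while a 32 x 32 x 2 block is rejected.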
Stojiljkovic, Vladana
e91cb4f320
SWDEV-505795 - Return the same ptr from hipIpcOpenMemHandle if it is called multiple times ( #93 )
...
* SWDEV-505795 - Return the same ptr from hipIpcOpenMemHandle if it is called multiple times
* Move initialization outside of if statement
2025-04-10 11:20:36 +02:00
Chaudhary, Jatin Jaikishan
5214d1ca07
SWDEV-525969 - add gfx950 to use ocp type for fp8 ( #157 )
...
* do not use __gfx94plus_clr__ macro in fp8 header
2025-04-09 16:21:39 +01:00
Stojiljkovic, Vladana
bc474ea5af
Set maxTexture2DLinear fields in deviceProp ( #89 )
2025-04-09 17:13:49 +02:00
Brzak, Branislav
b006380ff6
SWDEV-525653 - Make hipGetDeviceProperties and hipChooseDevice use the new API ( #159 )
2025-04-08 18:54:05 +02:00
Andryeyev, German
e974f7fde1
SWDEV-497841 - Add VmHeapArray support ( #76 )
...
Add VmHeapArray class to reduce the pressure on VA reservation, since
multiple memory pools can be active at the same time.
2025-04-03 21:04:18 +05:30
Andryeyev, German
3514f45544
SWDEV-524849 - Fix HIP error returned during capture ( #141 )
...
Always use the latest dependent nodes during hipEventRecord capture
2025-04-03 20:08:25 +05:30
Betigeri, Sourabh
8c6b90996e
SWDEV-523281 - [clr] Implementation of hipLaunchKernelExC and hipDrvLaunchKernelEx API with support for cooperative launch ( #92 )
2025-04-03 20:10:05 +09:00
Sang, Tao
8d90b44a1b
SWDEV-508863 - Support generic target in compressed fatbin ( #44 )
2025-03-27 20:13:51 +05:30
Belton-Schure, Aidan
ded41058a0
SWDEV-515426 - Use RAII classes for comgr ( #28 )
...
Change-Id: I9f6005542cc88f1e16e22741dcc0ce904fdaa2b0
2025-03-25 20:10:44 +05:30
Dittakavi, Satyanvesh
376f23b86a
SWDEV-516595 - Add __shfl functions with __hip_bfloat16 datatype ( #42 )
...
Also removes asserts in cooperative groups shfl functions since
__hip_bfloat16 shfl is present now
Change-Id: I57578b6e68dccc10c2ddcd194e9cc18bc7732ce1
2025-03-25 15:38:01 +05:30