Commit Graph

12882 Commits

Author SHA1 Message Date
Ioannis Assiouras cffff4e1cb SWDEV-457859 - Initialize isWGPMode_ in WorkGroupInfo
Change-Id: Ie3f3c0bcea84368c1b0607fd52b4bc7cae41c512
2024-04-25 16:36:55 -04:00
shadi f2b01782ac SWDEV-420016 - Add more driver side graph APIs
Signed-off-by: shadi <shadi.dashmiz@amd.com>
Change-Id: Iff3ee7dcbcd24836f227fdc9bd5ff4b554ac914f
2024-04-25 12:50:43 -04:00
German Andryeyev 9fdddb7c5d SWDEV-447691 - Correct handle type for DX12 semaphore
Change-Id: Id23882286cb2a0d0472964ffc501ab27b7dc7f00
2024-04-25 11:24:56 -04:00
Ioannis Assiouras 2841aab017 SWDEV-451099 - Added include for __half type definitions for non-HIP code on windows
Change-Id: Id80cef5a36db8707276de052cbaf73b6826d222f
2024-04-24 15:31:31 -04:00
German Andryeyev 5c23440199 SWDEV-353281 - Align VA size
Lower layer ignores alignment

Change-Id: If16df951ecefddc804a6effe013058afc595d30f
2024-04-24 15:22:20 -04:00
Julia Jiang 1761f1b7f5 457619 - Fixed the broken link to the HIP build instructions
Change-Id: Ica87b4ab511d26e0372502f069afc0e3baaa3256
2024-04-24 11:41:07 -04:00
Rahul Manocha 880963346d [SWDEV-454661][SWDEV-454653] - GraphExecMemcpyNodeSetParam to return error on memcpy direction change
Change-Id: I2c8f5ea394caeaaa6895003e63cd62a052c491f8
2024-04-23 12:56:30 -04:00
Konstantin Zhuravlyov 5a715ed160 Switch luxmark to lightning compiler for all ASICs
Change-Id: Idcd37628a2167f0bd2db2a83132a1862cbd051b0
2024-04-23 10:00:39 -04:00
kjayapra-amd 74ffc5f0d5 SWDEV-413997 - Cleanup fixes for Virtual Memory Management.
Change-Id: I9a4a4d9087b5daf15e3ba31e786d34db431212a1
2024-04-22 10:58:06 -04:00
German Andryeyev 0ccdb3e160 SWDEV-440746 - Release last command on terminate
Change-Id: Ib6a9b8fc9a8692eb17b39b854cefd92c6b59733f
2024-04-22 09:57:38 -04:00
German Andryeyev 7448113cfc SWDEV-440746 - Remove obsolete code
The "optimized" version of memcpy is outdated and
was used in win32 only.

Change-Id: I7f2e0e9051e37cec95438266824b5b0025c324c6
2024-04-22 09:56:42 -04:00
kjayapra-amd 863c56262e SWDEV-455041 - Continue processing fat binary even if other code object bundle processing fails.
Change-Id: Iea553ab0265c08341f915644075ce2b6ed9b3200
2024-04-20 14:25:49 -04:00
Rakesh Roy fb217fa9e0 SWDEV-453180 - Add UUID support for HIP_VISIBLE_DEVICES on Windows
- UUID needs to be specified in the format GPU-<body>, where <body> encodes the UUID as 16 chars
- Convert set UUID in HIP_VISIBLE_DEVICES to device index internally
- Then use existing device index logic for HIP_VISIBLE_DEVICES

Change-Id: I654f492a49cd4d7a9b7339360ab558165240caa5
2024-04-20 02:39:19 -04:00
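The two steps listed in the commit above (convert a UUID entry to a device index, then reuse the device index logic) can be sketched as follows. This is an illustration only, not the runtime's code: the function name, the device_uuids table, and the choice to stop at the first entry that does not resolve are all assumptions of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch: turn a HIP_VISIBLE_DEVICES-style value into device
// indices. device_uuids[i] is assumed to hold the "GPU-<body>" UUID of
// device i; a real runtime would query this from the driver.
std::vector<int> resolve_visible_devices(const std::string& value,
                                         const std::vector<std::string>& device_uuids) {
    std::vector<int> indices;
    std::istringstream tokens(value);
    std::string token;
    while (std::getline(tokens, token, ',')) {
        int index = -1;
        if (token.rfind("GPU-", 0) == 0) {
            // Step 1: convert a UUID entry to its device index.
            for (std::size_t i = 0; i < device_uuids.size(); ++i) {
                if (device_uuids[i] == token) {
                    index = static_cast<int>(i);
                    break;
                }
            }
        } else if (!token.empty() && token.size() <= 9 &&
                   token.find_first_not_of("0123456789") == std::string::npos) {
            // Plain numeric entry: already a device index.
            index = std::stoi(token);
        }
        // Step 2: from here on the entry is treated like any device index.
        // Assumption of this sketch: stop at the first entry that does not resolve.
        if (index < 0 || static_cast<std::size_t>(index) >= device_uuids.size()) {
            break;
        }
        indices.push_back(index);
    }
    return indices;
}
```

With two devices whose UUIDs are GPU-aaaaaaaaaaaaaaaa and GPU-bbbbbbbbbbbbbbbb, the value "GPU-bbbbbbbbbbbbbbbb,0" resolves to the indices 1 and 0 in this sketch.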
German Andryeyev 329ba271fa SWDEV-440746 - Wait for signal before release
Change-Id: I9e2aefdbcbba153c7f1080d80aab7a345eaf1eb4
2024-04-19 18:33:28 -04:00
German Andryeyev ffb516db3e SWDEV-353281 - Reuse timestamp on memory reuse
Mempool may reuse memory without a wait. Hence, the timestamp has
to be preserved and can't be destroyed.

Change-Id: I6f095f44afa69887a4b7aeb3b329804aedd96f3e
2024-04-19 18:00:29 -04:00
German Andryeyev fd81490bb8 SWDEV-440746 - Don't set CL_SUBMITTED twice
Change-Id: I9ba34454f7487d6bc0d398b322a147cbac6c6443
2024-04-19 17:36:51 -04:00
Satyanvesh Dittakavi 8f7acbdadb SWDEV-446610 - Attribute HIP_POINTER_ATTRIBUTE_SYNC_MEMOPS should return the correct value
Change-Id: Ieced2ee61bba28f2d1df96893a661287b0a5c7b7
2024-04-19 14:40:09 -04:00
Ioannis Assiouras bf74ef4025 SWDEV-451594 - Implement Readback and Avoid HDP Flush workaround for device kernel args
Change-Id: I6d41a089a17f55306e7ff402588a1e831b20a7a7
2024-04-19 09:29:20 -04:00
Anusha GodavarthySurya e829ef68e4 SWDEV-455869 - Revert "SWDEV-410751 - Consider null amd::memory is invalid."
This reverts commit a9ff2c5a43.

Change-Id: I26c4b3c74b2861afc17f979492d025b59d4388ab
2024-04-19 00:54:26 -04:00
kjayapra-amd 56ebf5157a SWDEV-413997 - VMM IPC implementation for Linux.
Change-Id: Icfeb83ca51e96be35abb67a94d6e3e1a1ca5a934
2024-04-18 11:28:13 -04:00
Anusha GodavarthySurya 8179fa98a2 SWDEV-450053 - Handle MemcpyNodeSetParamsTo/FromSymbol negative parameters
On Windows, all allocations on SVM memory are tagged with the flag ROCCLR_MEM_INTERPROCESS.
hipHostMalloc validation is based on the flags, so remove ROCCLR_MEM_INTERPROCESS before the check.

Change-Id: I823bbf228d9a4a9acb4abffc01ac6b3f544c6e12
2024-04-18 05:39:35 -04:00
Jaydeep Patel 12e0bdcd32 SWDEV-453535 - Capture hipMemset3DAsync.
Change-Id: I517c2557573db258b3e3e353f02f6a56652b0fde
2024-04-18 00:05:45 -04:00
Jaydeep Patel 8942939fac SWDEV-455346 - End wait if HostcallListener terminates.
Change-Id: I21ec8eadb189147c579ec65acf68de40d604686b
2024-04-18 00:04:00 -04:00
German Andryeyev 62559a6e5a SWDEV-440746 - Fix the hostcall buffer creation
Avoid a deadlock on the host call buffer creation. Since the buffer will be
allocated in the queue thread, use direct device memory allocation,
skipping the global context lock.

Change-Id: I09b55ee03bb42ab5d320c152b52a8c842c5fdcc1
2024-04-17 12:37:23 -04:00
sdashmiz d511e57257 SWDEV-441603 - Correct dst device
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: Ie60aa598dd73df66cdf02c1d96daf2dfccba7a59
2024-04-17 09:21:06 -04:00
Jatin Chaudhary d7b0d78fad SWDEV-379007 - fix bool check for fp8_fnuz
For fnuz numbers the zero value is 0x00; -0, i.e. 0x80, would be a NaN.

Change-Id: Ibdc4fb4b9fb307b5952434f08d45a8ddd6262db8
2024-04-17 05:31:21 -04:00
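The zero check described in the commit above can be illustrated with a minimal sketch. It assumes the usual FNUZ encoding, in which 0x00 is the only zero and 0x80 encodes NaN. Both function names are hypothetical and are not taken from the HIP fp8 header.

```cpp
#include <cstdint>

// Hypothetical helper, not the HIP header's API: truth value of an fp8 FNUZ
// bit pattern. Only 0x00 is zero; 0x80 is NaN, and NaN converts to true.
constexpr bool fp8_fnuz_to_bool(std::uint8_t bits) {
    return bits != 0x00;
}

// For contrast: masking off the sign bit treats 0x80 as -0, which is only
// appropriate for formats that actually have a signed zero.
constexpr bool signed_zero_to_bool(std::uint8_t bits) {
    return (bits & 0x7F) != 0x00;
}
```

The two helpers differ only for the pattern 0x80: the sign-masking variant reports it as zero, while the FNUZ variant reports it as non-zero.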
Sameer Sahasrabuddhe 03562a2547 SWDEV-454959 - ignore the upper half of the mask in wave32 mode
Change-Id: If027dd8cbe5cbe142fff353cb72c16f08e9aea8e
2024-04-17 10:12:57 +05:30
pghafari 5ddca5854c SWDEV-455699 - removing HW_REG_TRAPSTS for gfx12
Change-Id: I5f8b030eefdb37d3d51da3e135e5aa0f18ad9018
2024-04-16 19:46:21 -04:00
Jatin Chaudhary 49349f168c SWDEV-379007 - use avx instruction for bf16 cvt
AMD CPUs have had avx512_bf16 support for quite some time now (from
consumer Ryzen 7000 series to enterprise grade CPUs). This
patch should allow users to use the hardware bf16 unit when running the
__host__ variants of the function. This can be enabled via `hipcc ...
-mavx512vl -mavx512bf16`.

Change-Id: I67c377afc95ddfe8d45a048dce078a247d4a1878
2024-04-16 18:35:08 -04:00
German Andryeyev c95a75a2bf SWDEV-444670 - Enable teardown class
Force implicit runtime teardown with a global destructor.

Change-Id: Iabe63dedf5b94fefc98668585c45a61607120669
2024-04-16 12:00:06 -04:00
kjayapra-amd a1e0970d6d SWDEV-422580 - Adding back the pcie.function to PCI address string in hipGetDevicePCIBusId.
Change-Id: I932724cc872d7ae2643ce6ac2924901cb49cd7ad
2024-04-16 07:28:48 -04:00
Jatin Chaudhary ca07f59fb1 SWDEV-379007 - initial implementation of fp8 header
Change-Id: Id9a5a85641882961e4d860a815217c641e6f3387
2024-04-16 05:37:59 -04:00
Sourabh Betigeri fcfe2ec88b SWDEV-453577 - Fixes to account for right CU count based on WGP or CU mode
Change-Id: Ib9739f9917bc6ff69cc76f444d909311922ebc1e
2024-04-15 11:53:43 -04:00
kjayapra-amd 00ddc3e284 SWDEV-413997 - Fixing alignment validation check for power of 2 instead of granularity factor.
Change-Id: I1e0db6e0628c09d26850e5a0339e2a4660442db8
2024-04-15 09:45:29 -04:00
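Reading the commit above as "validate that the alignment is a power of two, not a multiple of the granularity", the difference between the two checks can be sketched like this. The helper names and the example values are hypothetical; the actual validation code is not shown in this log.

```cpp
#include <cstddef>

// Hypothetical helper: true when alignment is a non-zero power of two.
// A power of two has exactly one bit set, so clearing the lowest set bit
// with (x & (x - 1)) must leave zero.
constexpr bool is_power_of_two(std::size_t alignment) {
    return alignment != 0 && (alignment & (alignment - 1)) == 0;
}

// Shown for contrast: a granularity-multiple check accepts values such as
// 3 * 4096 that are not powers of two.
constexpr bool is_multiple_of(std::size_t alignment, std::size_t granularity) {
    return granularity != 0 && alignment % granularity == 0;
}
```

For example, 12288 is a multiple of 4096 but not a power of two, so it passes the second check and fails the first.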
kjayapra-amd 815e450cfd SWDEV-413997 - Read Access can be valid now that ROCr takes care of access.
Change-Id: Iecda74ca0207c95d3fbed8b4e15c8c4c5895d939
2024-04-15 06:00:14 -04:00
Rakesh Roy 52db98edd9 SWDEV-453180 - Add UUID support for HIP_VISIBLE_DEVICES on Linux
- UUID is an ASCII string with a maximum of 21 chars which uniquely identifies a GPU
- Convert set UUID in HIP_VISIBLE_DEVICES to device index internally
- Then use existing device index logic for HIP_VISIBLE_DEVICES

Change-Id: I8cab4fe42459f8209b97f909300789e6e687b9ac
2024-04-13 22:07:19 -04:00
kjayapra-amd d52d16c8e6 SWDEV-413997 - Fixing multiple device cases.
Change-Id: I10ad3fbfca887e92cd81f68392fa1acf753cbd2b
2024-04-13 06:14:03 -04:00
German Andryeyev 7de7da4016 SWDEV-455254 - Reduce blit kernels signature
Remove offset from blit kernels, since it can be applied in setup.

Change-Id: I06b585068d68a0ee8e125ddf46a36fccb372f30d
2024-04-12 14:45:55 -04:00
taosang2 35c80dd482 SWDEV-424956 - Fix half vector printf issue
Refactor PrintfDbg::outputArgument() to remove potential risk.
Fix half vector printf issue on all devices.
Fix FEAT-56794 as well.

Change-Id: Iae39359d2128588def2e43d77fe58e868b8e71ff
2024-04-12 14:25:44 -04:00
Jaydeep Patel d52168b46d SWDEV-436754 - Use glbctx instead so that ref count increments for multi devices and chunk decommit gets delayed.
Change-Id: Ia4b0d5fbfa8f198776e52d14de8b22c6942f740d
2024-04-12 00:04:34 -04:00
German Andryeyev f0c7ecf617 SWDEV-455254 - Add kernel arg optimization
Add kernel arguments optimization into blit path.
Enabled by default on MI300.

Change-Id: I2694a81b90d48ad07d86dfe4c0c64fe187bada8e
2024-04-10 18:08:37 -04:00
kjayapra-amd 2b8634bada SWDEV-446298 - Adding error code to the logs on p2p hsa api failure.
Change-Id: Ic41b1ad1b64cca0e31986337a83a5146d52a7328
2024-04-10 06:00:00 -04:00
Jatin Chaudhary 481912a1fd SWDEV-379007 - add __hip_bfloat16_raw types
This also brings bfloat16 implementation closer to CUDA's.

Change-Id: I23f381141faacd6537923ae9b88ada4d661db496
2024-04-09 05:32:13 -04:00
Saleel Kudchadker 3f0bcf7834 SWDEV-301667 - Fix SDMA mask reuse
If we are using the mask returned by getLastUsedSdmaEngine() then we
need to apply the SDMA Read/Write mask to it before using with HSA
copy_on_engine API.

Change-Id: I6e5dc6c187eeb3c61ee159e9d2a0fa7b4737c06e
2024-04-08 15:42:52 -04:00
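The masking step described in the commit above can be sketched as below. This assumes that engines are represented as bits in a 32-bit mask; the function name and the parameter names are hypothetical, and only getLastUsedSdmaEngine() is a name taken from the commit message.

```cpp
#include <cstdint>

// Hypothetical sketch: last_used stands in for the value returned by
// getLastUsedSdmaEngine(), direction_mask for the SDMA read or write mask
// that applies to the current copy.
constexpr std::uint32_t restrict_to_direction(std::uint32_t last_used,
                                              std::uint32_t direction_mask) {
    // Keep only engines that are valid for this copy direction. A result of 0
    // means the previously used engine cannot be reused for this copy.
    return last_used & direction_mask;
}
```

What the caller does when the result is 0 (for example, falling back to a fresh engine selection) is outside the scope of this sketch.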
Sourabh Betigeri dbac2976e4 SWDEV-451964 - Limit gpu single allocation percentage for gfx940 only
Change-Id: Iadcdadd734e7aeeb23742e426353defa972d3ad5
2024-04-05 09:43:42 -04:00
Ioannis Assiouras d7f352dbed SWDEV-453301 - Remove the option to write multiple packets in dispatchGenericAqlPacket
Dispatching multiple packets while ringing the doorbell only once is not supported by the lower layers.

Change-Id: I7665a2dcdd4ef9e47dadfe410180fed64c5a4ee0
2024-04-05 05:28:10 -04:00
Rakesh Roy 880f1f0049 SWDEV-450361 - Add nullptr validation for waitStream
- The application passes null for the stream parameter of the hipStreamWaitEvent API
- When the event stream isn't capturing and the event is not recorded, this causes a segfault because deviceId() is accessed from waitStream

Change-Id: I8b87ffd6f234677f68b66dcb7ef44b2ff04a7c91
2024-04-04 02:07:18 -04:00
cadolphe bc80802c1a SWDEV-446726 - Disable large bar for 32 bit windows
When large bar is enabled, persistent memory leads to overallocation for 32 bit architecture.

Change-Id: Iae39359d8128588de02e42d77fe58e868b8e71fd
2024-04-03 15:36:41 -04:00
cadolphe f7b1398361 SWDEV-443537 - fix make build warning message
Add cltrace compile definition for CL_TARGET_OPENCL_VERSION to OpenCL 2.2

Change-Id: Ie868ab0a6e86951afc6d07da58be942c3b736d15
2024-04-02 16:42:01 -04:00
cadolphe 411960a131 SWDEV-451687 - Fix alloc message values in AMD_LOG_LEVEL for 32 bit
Change-Id: Icbe67024297c92bf59139b6a2ccd2ba3674f60b1
2024-04-01 13:32:20 -04:00