Commit graph

2769 commits

Author SHA1 Message Date
Tony Gutierrez 3b30b8a975 rocr: Remove KMT usage from AMD ext
Use the core Driver in AMD's HSA extension API to make it
agnostic to the underlying OS and kernel-mode driver.


[ROCm/ROCR-Runtime commit: d3a4dc9687]
2025-02-25 21:51:52 -05:00
James Zhu b42578b070 kfdtest: fix resource leakage
Resources allocated in SetUp/HsaNodeInfo::Init
need to be deleted in TearDown/HsaNodeInfo::Delete.

Signed-off-by: James Zhu <James.Zhu@amd.com>


[ROCm/ROCR-Runtime commit: f8d8b8011f]
2025-02-24 19:38:59 -05:00
Khatri, Shweta e00c926d27 rocr: Adding support for Stochastic PC Sampling for gfx94x (#47)
Change-Id: Ide4c2e25b88f1f25ea4ce35a619b93963c0355ee

[ROCm/ROCR-Runtime commit: 322a794cf6]
2025-02-22 00:13:08 -05:00
Tony Gutierrez 727159b4db rocr: Remove KMT usage from CPU agent
Use the core Driver object in the CPU agent to make it OS/driver
agnostic.

Implement the GetMemoryProperties() and GetCacheProperties() methods
for the KFD driver.


[ROCm/ROCR-Runtime commit: a9f6bc8d0e]
2025-02-21 10:00:38 -05:00
Cheruvally, Aravindan 69c014290d Enable/Disable rocprofiler-register pkg dependency based on build type (#30)
Co-authored-by: Yat Sin, David <David.YatSin@amd.com>

[ROCm/ROCR-Runtime commit: 20e6c87a09]
2025-02-20 11:07:35 -05:00
David Yat Sin 2dcc1989bc rocr: Add queries for async scratch reclaim
Add support for these 2 new queries:
- HSA_AMD_AGENT_INFO_SCRATCH_LIMIT_MAX
  Maximum amount of scratch memory allowed on this agent

- HSA_AMD_AGENT_INFO_SCRATCH_LIMIT_CURRENT
  Current limit for scratch memory on this agent


[ROCm/ROCR-Runtime commit: 107b48fb15]
2025-02-19 21:02:00 -05:00
David Yat Sin 5905b82579 rocr: Update for new async scratch reclaim
Updating ROCr code to match new handshake protocol with CP FW for
asynchronous scratch reclaim.
Increase previous limits when scratch reclaim feature is available.


[ROCm/ROCR-Runtime commit: aa2f98e6f9]
2025-02-19 21:02:00 -05:00
David Yat Sin a0903ecc7a rocr: Remove unused fields in amd_queue_t
scratch_wave64_lane_byte_size and alt_scratch_wave64_lane_byte_size are
not used by CP FW.


[ROCm/ROCR-Runtime commit: 2f8a9b28d0]
2025-02-19 21:02:00 -05:00
David Yat Sin 1474a6c774 rocr: Remove gfx940 and gfx941 support
[ROCm/ROCR-Runtime commit: 13c591d250]
2025-02-19 12:16:24 -05:00
David Yat Sin 99e040e730 rocrtst: extend IPC test to support async_handler
[ROCm/ROCR-Runtime commit: 806ddfc8eb]
2025-02-19 11:19:09 -05:00
David Yat Sin 65686b9a0a rocr: Allow IPC signals in hsa_amd_signal_async_handler
Allow IPC signals to be registered with hsa_amd_signal_async_handler.
This forces AsyncEventsLoop to switch to polling instead of interrupts.


[ROCm/ROCR-Runtime commit: fa8be44df9]
2025-02-19 11:19:09 -05:00
Longlong Yao 082c6b7830 libhsakmt: allocate va in host path
Change-Id: I40a4395aca99ea8dfd8ff0ecde64eb2c3840d867
Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>


[ROCm/ROCR-Runtime commit: 26f001d3cb]
2025-02-15 07:56:45 -05:00
Adel Johar fcd8d9795b Docs: Update environment variables page
[ROCm/ROCR-Runtime commit: b4f8b5c202]
2025-02-14 10:15:20 -05:00
Harish Kasiviswanathan 729f98b05f libhsakmt: gfx950: Add option to enable HIGH_PRECISION
Environment variable HSA_HIGH_PRECISION_MODE can be used to control MFMA
precision

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: Ib78dd9dd8867025e090a3cca96ab6db4f65dea12


[ROCm/ROCR-Runtime commit: 2a64fa5e06]
2025-02-10 16:05:25 -05:00
Ranjith Ramakrishnan e8b8e92ce3 CMake: Add package conflict for the deprecated package hsakmt
For Debian use cases, a package conflict is required to remove the
deprecated package during package upgrade. Also removed the duplicate
setting of package obsoletes in the RPM use case.


[ROCm/ROCR-Runtime commit: 3be9c49b63]
2025-02-07 11:57:32 -05:00
Saleel Kudchadker d5f08e6fa8 rocr: Skip uSleep for non-interrupt signals
- When waiting on non-interrupt signals, do not uSleep. This causes
  regressions compared to interrupt signal usage.
- Clean up code.

Change-Id: I706bda0b13e64ffec0b607c1915d8380a2ce0dea


[ROCm/ROCR-Runtime commit: 890399a7cf]
2025-02-06 23:48:35 -05:00
Luna Nova 9a0f0858fa rocr: set underlying type of hsa_region
Set underlying type of hsa_region_info_t, hsa_amd_region_info_t
to int.

Change-Id: Ibf97a025eec6176d8e28af8009e9bd6795ca061f


[ROCm/ROCR-Runtime commit: 166b08346b]
2025-02-06 16:25:03 -05:00
Choudhary, Rahul cfcb5a9c4d Update rocm_ci_caller.yml to use amd-master (#11)
Update rocm_ci_caller.yml to use amd-master until amd-mainline is aligned

Signed-off-by: Choudhary, Rahul <Rahul.Choudhary@amd.com>

[ROCm/ROCR-Runtime commit: 16cd712685]
2025-02-04 12:46:10 -08:00
Choudhary, Rahul 3842fe1e25 Update rocm_ci_caller.yml added amd-npi pull request trigger
[ROCm/ROCR-Runtime commit: 7c03610905]
2025-01-31 16:10:41 -08:00
Choudhary, Rahul 751ebdfc0e Create rocm_ci_caller.yml
[ROCm/ROCR-Runtime commit: c603d7164c]
2025-01-31 14:25:18 -08:00
Choudhary, Rahul e4e3c59968 Create kws_caller.yml
[ROCm/ROCR-Runtime commit: 460a28ed03]
2025-01-31 14:22:03 -08:00
sonadeem 02edf09f87 cmake: Fix BUILD_SHARED_LIBS option and README for it
BUILD_SHARED_LIBS is a global flag so we don't need to set a default
option for it in both libhsakmt and hsa-runtime, only the top level
CMakeLists file. Also updated README to reflect that libhsakmt is
always built statically and gets linked to libhsa-runtime.

Change-Id: I1511f68a268032bec9758bc731d8074f33ec980f


[ROCm/ROCR-Runtime commit: ff01f62777]
2025-01-30 14:17:27 -05:00
David Belanger 75a060fc53 kfdtest: Convert ExtendedCuMask test to multi-GPU framework
Convert test to use multi-GPU framework.

Add mutex to fix intermixed log issue and annotate logging with
gpu node number.

Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: Ic2beeadb1eb4b5a9a0710ac1dbd60b9bf1d84c33


[ROCm/ROCR-Runtime commit: f24d789dee]
2025-01-30 11:41:00 -05:00
David Yat Sin a4ca610829 Set default build to shared libraries
Change-Id: I73eeac3652a69a71b2c0dc2daeabc9af2c0cfd14


[ROCm/ROCR-Runtime commit: 0aebe7f3d0]
2025-01-29 21:24:59 +00:00
Sv. Lockal d1507361ec Fix build issues for musl libc (#267)
Change-Id: Ia31330b0f96669966712b58986abeca754c2cbb9


[ROCm/ROCR-Runtime commit: 5d04bd42f3]
2025-01-29 14:31:05 +00:00
Lang Yu 85125b1054 kfdtest: update AtomicIncIsa for gfx12
"s_waitcnt 0" (deprecated in gfx12) is redundant here.

s_endpgm will wait for all outstanding instructions
to complete before executing.

Change-Id: Ia8b4dd0fd8dd713e7ba2cba9db85b7b12cee1dd4
Signed-off-by: Lang Yu <lang.yu@amd.com>


[ROCm/ROCR-Runtime commit: d159b29dc6]
2025-01-28 20:32:41 -05:00
Yiannis Papadopoulos 71165b9460 rocr: Changing to a device SVM flag
Change-Id: Ib085801d23604eeef0a17a05cf2b298170fb3d24


[ROCm/ROCR-Runtime commit: 03bd4c9508]
2025-01-28 17:06:16 +00:00
Yiannis Papadopoulos f107054dbe rocr: Use SVM information to separate dev heap
Use SVM information from user accessible memory

Change-Id: I8fad37eb1a90dc1f5827a096552130a3fd6187f4


[ROCm/ROCR-Runtime commit: 144e7674d1]
2025-01-28 17:05:52 +00:00
Min Zhou ee1ff92026 rocr: delete duplicated conditional expression
Change-Id: Idc8b1a8ca2975f33191a448f03cabf3fc4f8f8a6


[ROCm/ROCR-Runtime commit: a82f2f3134]
2025-01-28 10:48:44 -05:00
Yiannis Papadopoulos 17807d78bd rocr/aie: AIE agent memory pools correct size and user data pool
Change-Id: I831711a7d1cdc36cbc9ed30bd74d0dc984228ce7


[ROCm/ROCR-Runtime commit: 1d8a77db34]
2025-01-28 10:48:16 -05:00
James Zhu 647b705679 libhsakmt: increase default svm.alignment_order
GFX950 can support page table fragments up to 18 without
performance loss, so set the GFX950 default svm.alignment_order to 18.

Change-Id: Ibcdb7f041fb07a38e924c471beec261ea227ca1d
Signed-off-by: James Zhu <James.Zhu@amd.com>


[ROCm/ROCR-Runtime commit: 9509af4b98]
2025-01-28 08:27:19 -05:00
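
To give a sense of scale for the svm.alignment_order change above, the following is a minimal arithmetic sketch. It assumes that an order of n means an alignment of 2**n base pages and that the base page is 4 KiB; neither assumption is stated in the commit message, and the function name is hypothetical.

```python
# Assumption: base page size is 4 KiB (not stated in the commit message).
PAGE_SIZE_BYTES = 4096


def alignment_bytes(order: int, page_size: int = PAGE_SIZE_BYTES) -> int:
    """Return the alignment in bytes for a given order.

    Assumption: an order of n stands for 2**n contiguous base pages.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    return (1 << order) * page_size


# Under these assumptions, order 18 corresponds to 2**18 * 4 KiB = 1 GiB.
gfx950_alignment = alignment_bytes(18)
```

Under the same assumptions, an order of 9 would correspond to 2 MiB.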
Amber Lin e262729f6f kfdtest: Create gfx950 blacklist
This patch creates the blacklist for gfx950 by copying gfx942 but adding
KFDGWSTest.Semaphore as GWS support is completely removed from gfx950.

Change-Id: I5d7c17e57b8cfd9fae63780ecc9dd55662cfdade
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 0b6e457201]
2025-01-28 08:26:44 -05:00
Yiannis Papadopoulos 428cc5b47c rocr/aie: Add dma-buf import support for AIEAgents via the Driver interface
Change-Id: I70f8d8772dda7c06944d75042cb3034ddd89aff4


[ROCm/ROCR-Runtime commit: 26bfa0b8f6]
2025-01-27 15:22:46 -05:00
Lancelot Six a95209dde6 libhsakmt: gfx950 uses same VGPR block size as gfx940
Make sure to allocate the same amount of space for VGPR data on
gfx950 as is done for gfx940.

Change-Id: I6a0820996389627ccbdfef856e5150c46fac92a1
Signed-off-by: Lancelot SIX <lancelot.six@amd.com>


[ROCm/ROCR-Runtime commit: 76052ba028]
2025-01-27 14:06:42 -05:00
Shweta Khatri 4325142db1 rocr: Use view3dAs2dArray flag, for thick/3D swizzle modes.
Added the HSA_IMAGE_ENABLE_3D_SWIZZLE_DEBUG environment flag to
enable/disable this. The default value is false (view3dAs2dArray = 1).
Enabling this flag enables support for swizzles that do 3D
interleaving. Note that all features of 3D images are supported
with 2D swizzles; only the access patterns differ, so cache
hit rates may be better or worse depending on usage. Volumetric
algorithms do better with 3D, and apps that tend to access a
single slice at a time do better with 2D.

Change-Id: Id8574a6710fe4333a1ee331e5ce9195a81434198


[ROCm/ROCR-Runtime commit: 6361466baa]
2025-01-27 09:28:33 -05:00
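
The relationship between the HSA_IMAGE_ENABLE_3D_SWIZZLE_DEBUG flag and view3dAs2dArray described in the commit above can be sketched roughly as follows. This is an illustration only, not the ROCr implementation: the function name is hypothetical, and treating the string "1" as "enabled" is an assumption about how the flag is parsed.

```python
import os

ENV_NAME = "HSA_IMAGE_ENABLE_3D_SWIZZLE_DEBUG"


def view3d_as_2d_array(environ=None) -> int:
    """Return the view3dAs2dArray value implied by the environment flag.

    Flag unset or disabled (the default) -> 1: 3D images use 2D swizzles.
    Flag enabled -> 0: swizzles with 3D interleaving are allowed.
    """
    env = os.environ if environ is None else environ
    # Assumption: only the exact string "1" counts as enabled.
    enabled = env.get(ENV_NAME, "0") == "1"
    return 0 if enabled else 1
```

Passing a plain dict instead of reading os.environ directly keeps the sketch easy to exercise without changing the process environment.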
Tony Gutierrez ff52d6fc13 rocr: Add WaitMultiple to core Signal
Replaces WaitAny with WaitMultiple to more closely align with the
underlying driver API for waiting on multiple events.

WaitMultiple adds a single parameter, wait_on_all, to the WaitAny
interface providing a single function for waiting on multiple
events when we only need AND and OR semantics for the signal
checking logic.

Change-Id: I68a4a45d48151d9d69aef02fd8f7263b9e6c0e75


[ROCm/ROCR-Runtime commit: 8a38f121ea]
2025-01-27 09:21:43 -05:00
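
The AND/OR semantics that wait_on_all adds in the WaitMultiple commit above can be modeled with a small sketch. It covers only the check that decides whether a wait is satisfied for one snapshot of signal values; blocking, timeouts, and the driver interaction are left out. The function signature and the use of per-signal predicates are assumptions made for illustration, not the ROCr interface.

```python
from typing import Callable, Sequence


def wait_multiple_satisfied(
    values: Sequence[int],
    conditions: Sequence[Callable[[int], bool]],
    wait_on_all: bool,
) -> bool:
    """Decide whether a multi-signal wait is satisfied for one snapshot.

    wait_on_all=True  -> AND semantics: every condition must hold.
    wait_on_all=False -> OR semantics: one satisfied condition is enough,
                         which is how a WaitAny-style call behaves.
    """
    if len(values) != len(conditions):
        raise ValueError("values and conditions must have the same length")
    results = [cond(value) for value, cond in zip(values, conditions)]
    return all(results) if wait_on_all else any(results)
```

For example, with two signals that are each expected to reach zero and only one of them at zero, the OR form is satisfied while the AND form is not.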
Lancelot Six c7b1fd714e libhsakmt: Use the node info to determine LDS size
The CWSR area size needs to take into account the size of LDS each
active workgroup can have. The current implementation uses a constant
for that. This patch refactors this to use the HsaNodeProperties of the
device the CWSR area is for to determine the LDS size.

Change-Id: Ib8585b2b7140ec5c99e7b7d62e67f785697c028a
Signed-off-by: Lancelot Six <Lancelot.Six@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: c51aa0d155]
2025-01-26 21:46:32 -05:00
Alex Sierra da483d7588 kfdtest: add support for gfx9.5.0 in shader store
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I48b98ff631bd1aa1a044b60583ff256e43b17423


[ROCm/ROCR-Runtime commit: 268054cd28]
2025-01-26 21:45:07 -05:00
Alex Sierra 840a613723 kfdtest: Add gfx 9.5 as FAMILY_AV
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: Ib5696eee1d4f64c9c87d714eae7c80fbbd1e2b23


[ROCm/ROCR-Runtime commit: e94ff8a36c]
2025-01-26 21:43:55 -05:00
David Yat Sin d0ae8b2eb5 rocr: Add support for gfx950
<squashed with patch for gfx950 generic targets>

Signed-off-by: Chris Freehill <Chris.Freehill@amd.com>

Change-Id: Ifec6d93cf46c7fbf736c6572882299e279260af6


[ROCm/ROCR-Runtime commit: dab8f2fc65]
2025-01-26 13:04:58 -05:00
Ben Vanik bda034bb82 rocr: Fix HostQueue to obey the alignment requirement
Change-Id: I06542e9ff94e826ca0abba0328b301fec50a95ea


[ROCm/ROCR-Runtime commit: 7d64fe49fa]
2025-01-24 12:08:11 -05:00
David Yat Sin 922b61ddee rocr: Add thread priority for AsyncEventHandler
Set priority to maximum for signal event handler and minimum for
exceptions event handler.

Change-Id: I1b982d3c2e4c880fafc073fe1a542d01692a6fdc


[ROCm/ROCR-Runtime commit: 7ea25ebb85]
2025-01-24 10:08:12 -05:00
Ben Vanik 15cc61baf4 rocr: Fixing non-portable inline attribute on hsa_flag_* utilities.
Change-Id: Ie1c53fef407a71b5ec4c6eaf3a3ed00871184408


[ROCm/ROCR-Runtime commit: 9971e7b004]
2025-01-23 15:09:21 -05:00
Tony Gutierrez e82871f20b rocr: Generalize driver discovery
Generalize the driver discovery and move driver-specific
functionality to the concrete driver implementations.
Currently, this process is tightly coupled to the hsakmt
which is GPU and OS specific.

Change-Id: Ie1c53fef407a71b5ec4c6eaf3a3ed00871184409


[ROCm/ROCR-Runtime commit: 15107afb11]
2025-01-23 15:09:14 -05:00
Tony Gutierrez c325fac9ba rocr: Make Open() and Close() virtual in Driver
Change-Id: Iac054c08383b080ca2b2ec6d65019bf2f083b763


[ROCm/ROCR-Runtime commit: 77fa5af618]
2025-01-23 15:09:06 -05:00
Tony Gutierrez d6504e1f2d rocr: Forward declare Driver in the Agent class
Change-Id: Ib27081bf31446af92602f723f352fb75ec3f378e


[ROCm/ROCR-Runtime commit: 8bbc44d51b]
2025-01-23 15:08:59 -05:00
Shweta Khatri c5822cee5a Revert "Revert "hsakmt: Only set exec flag when requested""
This reverts commit 5a8092bccf.

Reason for revert: This restores change Id1154f08f6ba21c633905fd46b06053994d6f3cc in the ROCR repo, which prevents memory allocations from being automatically granted the 'executable' flag, addressing previously incorrect and unsafe behavior in the ROCm driver.

Change-Id: I3d45c45859929a80f7791681b411251e099a1901


[ROCm/ROCR-Runtime commit: 2d4a578020]
2025-01-23 09:08:25 -05:00
Longlong Yao 0b1dc71200 rocr: add AMD_KERNEL_CODE_PROPERTIES_ENABLE_WAVEFRONT_SIZE32
Change-Id: I158705499f4ab0b1231d698d66902eb4ab1ececa
Signed-off-by: LonglongYao <Longlong.Yao@amd.com>


[ROCm/ROCR-Runtime commit: 5d8fba133d]
2025-01-22 13:02:31 -05:00
David Yat Sin 4c87e51013 Update license for 2025
Change-Id: Ie3c7f6034c9a73d9a4af3c1432ed7ac3b4a6a3b1


[ROCm/ROCR-Runtime commit: ff671f7550]
2025-01-21 15:28:57 -05:00
Swati Rawat 10004bd358 Update index.rst
Change-Id: I493e3dc3782608e4d0d712569a6e6fd3b376cdbe


[ROCm/ROCR-Runtime commit: 77c2a21a92]
2025-01-21 10:05:28 -05:00