Commit graph

2449 commits

Author SHA1 Message Date
David Belanger 9853cc28e5 Update shader for GFX12
Minor changes to instructions for GFX12.

Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: Iac5be900e3755099d83010fb1a2066b4dbb52dda
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: bde8e7a212]
2024-06-24 14:26:21 -05:00
David Belanger f06e7461d1 kfdtest: Updated KFDCWSRTest for GFX12
Updated ShaderStore shader (used by CWSR test) for GFX12.
Workgroup ID is now passed in a different register.
Minor changes for new scope syntax.

Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: I6fdabc8b62cba201d7777a736d3d43cfae28ca4c
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: e086c383fe]
2024-06-24 14:26:21 -05:00
Jonathan Kim e2404e6311 kfdtest: fix address watch test for GFX12
New watchpoint exception status bits have been assigned to the 4 least
significant bits, so change the test verification mask to check against the
first watchpoint ID accordingly.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: If83950207ea9f66cd230c23e7386a97b3893c2eb
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: 3b842c39f1]
2024-06-24 14:26:21 -05:00
Jonathan Kim 5a33ad7ec4 kfdtest: fixup test traphandler for gfx12
Fix traphandler for KFD debugger testing.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>

Change-Id: Ib8f5aac3d1b99e4463ac56b5f6d5dee2c367c447
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: a2e9226784]
2024-06-24 14:26:21 -05:00
David Belanger 88335be213 libhsakmt: Fix VGPR size for GFX12/GFX12.1
Set max size needed for VGPR when doing a CWSR for GFX12 and GFX12.1.

Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: Iddefc62f1ad419c6f5ab6a872048457a1dc24037
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: 259a724e21]
2024-06-24 14:26:21 -05:00
David Belanger 14881f6707 kfdtest: Added gfx1201 filter
Initial template for GFX12.0.1.

Change-Id: I5d2be1f594bf057c04f6feee75a80c61a9d7e4a8
Signed-off-by: David Belanger <david.belanger@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: 197a6c2e6c]
2024-06-24 14:26:21 -05:00
David Belanger 02d087ae25 kfdtest: Add support for GFX12
Added FAMILY_GFX12 code.

Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: I26f01055b3c8732b4b6e1195d34533d9f89032d2
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: 7d2c78a37d]
2024-06-24 14:26:21 -05:00
David Belanger aeffc30a1d kfdtest: Added gfx1200 filter.
Initial template for GFX12.

Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: I552374bfcc0dd6272d170df85d36d0dbca0196d5
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: 09744e4959]
2024-06-24 14:26:21 -05:00
James Zhu 259af9e854 libhsakmt: update KFD ioctl minor version
PC Sampling is not upstream yet, so use 1.16 for
contiguous VRAM allocation and 1.17 for PC sampling.

Change-Id: Ib5d22e8f386ce7fe3f7111485b9632b61227e539
Signed-off-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: 5786dbbb76]
2024-06-24 14:26:21 -05:00
James Zhu f0d3c72605 kfdtest: skip test when PC Sampling is not supported by ASIC
Skip test when PC Sampling is not supported by ASIC.

Change-Id: I6f9be0bdaed66e51052723b6df6908079470cefb
Signed-off-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: 1087dea925]
2024-06-24 14:26:21 -05:00
Jonathan Kim 23c0cf8727 libhsakmt: fix pc sampling return of functions
C error returns are positive in user space, so check against errno
instead.
Fix declaration of return to type HSAKMT_STATUS.
KFD IOCTL should handle size return when querying capabilities so return
size to caller unconditionally.
Clean up error translations per function so that it's stylistically
clear.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Ic37390425f370c7ad88f9ed014444decf19383a3
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: 206db80a56]
2024-06-24 14:26:21 -05:00
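The errno handling described in the commit above can be illustrated with a minimal sketch. The `Status` enum, `TranslateErrno`, and `CheckedIoctl` are hypothetical names chosen for illustration and are not the library's actual types or functions; only `ioctl()` and `errno` are real.

```cpp
#include <cerrno>
#include <sys/ioctl.h>

// Hypothetical status codes for illustration; not the library's real enum.
enum class Status {
  kSuccess,
  kInvalidHandle,
  kInvalidParameter,
  kNotSupported,
  kError
};

// Maps a positive errno value to a status code.
Status TranslateErrno(int err) {
  switch (err) {
    case 0:          return Status::kSuccess;
    case EBADF:      return Status::kInvalidHandle;
    case EINVAL:     return Status::kInvalidParameter;
    case EOPNOTSUPP: return Status::kNotSupported;
    default:         return Status::kError;
  }
}

// ioctl() reports failure as -1 and leaves the positive error code in errno,
// so the translation reads errno rather than the return value.
Status CheckedIoctl(int fd, unsigned long request, void* arg) {
  if (ioctl(fd, request, arg) == -1) return TranslateErrno(errno);
  return Status::kSuccess;
}
```

Keeping the translation in one helper per call site is one way to make the mapping from errno to status easy to read.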
Kent Russell af2a46cfd6 kfdtest.exclude: Fix blacklist
We need : to end each subtest, except for the last entry.

Change-Id: I9515d90703c9679e06a4acd124883540c1d5b832
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: 371d078226]
2024-06-24 14:26:21 -05:00
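The separator rule from the kfdtest.exclude commit above (a `:` after every entry except the last) can be sketched as a small join helper. `JoinExcludeList` and the test names in the usage note are hypothetical; the actual exclude file is not shown in this log.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Joins test names with ':' between entries and no trailing separator.
std::string JoinExcludeList(const std::vector<std::string>& tests) {
  std::string joined;
  for (std::size_t i = 0; i < tests.size(); ++i) {
    joined += tests[i];
    if (i + 1 < tests.size()) joined += ':';  // every entry but the last
  }
  return joined;
}
```

For example, joining `SuiteA.Test1`, `SuiteA.Test2`, and `SuiteB.Test3` yields `SuiteA.Test1:SuiteA.Test2:SuiteB.Test3`.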
Chris Freehill 8e9cb92cda Merge 'thunk/integrate-into-rocr' into integrate-libhsakmt
[ROCm/ROCR-Runtime commit: 79e4eda0b6]
2024-05-02 21:52:49 -05:00
David Yat Sin e59ac6361e rocrtst: add test for contiguous mem allocations
This test may fail when run on non-upstream versions of KFD as this
feature will not be upstreamed.

Change-Id: I7131e1f50984739c0df12e4c9afe790bd7e4cdfa


[ROCm/ROCR-Runtime commit: d2d95a8948]
2024-04-30 17:42:15 -04:00
David Yat Sin c53e11ec20 Temporary: Do not early release mutex when not ganging
It seems the Release() function is not reliable and can cause segfaults.
This is a temporary work-around until the Release() function is fixed.

Change-Id: I95470a800c6153673e4b8f4fe46a646903325074


[ROCm/ROCR-Runtime commit: ac5fb8be9e]
2024-04-30 17:07:39 -04:00
Chris Freehill a8d049fa0d Prepare for integration into rocr
Change-Id: I6102b9910dbb9d09e09bb262a03c5c0ad4ce66f4


[ROCm/ROCR-Runtime commit: 11fd5c2562]
2024-04-30 09:01:09 -05:00
David Yat Sin 860be91593 Use pthread_attr_setaffinity_np when available
If pthread_attr_setaffinity_np function exists use it instead of
pthread_setaffinity_np as pthread_setaffinity_np seems to fail to set
the affinity settings on some systems.

Change-Id: Icd8b17039699ac10d9cd5c4dbb6ac44630673949


[ROCm/ROCR-Runtime commit: 57b93e02a4]
2024-04-29 15:02:54 +00:00
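As an illustration of the approach in the commit above, the sketch below sets the affinity on the thread attributes before the thread starts, instead of calling `pthread_setaffinity_np` on a running thread. It assumes Linux with glibc, where `pthread_attr_setaffinity_np` is available; the helper names are hypothetical and the commit's build-time availability check is not reproduced.

```cpp
#include <pthread.h>
#include <sched.h>

// Returns the first CPU the calling thread may run on, or -1 on error.
int FirstAllowedCpu() {
  cpu_set_t mask;
  CPU_ZERO(&mask);
  if (sched_getaffinity(0, sizeof(mask), &mask) != 0) return -1;
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (CPU_ISSET(cpu, &mask)) return cpu;
  }
  return -1;
}

struct PinCheck {
  int cpu;      // CPU the thread was asked to run on
  bool pinned;  // set by the thread if its mask contains only that CPU
};

// Thread body: reads back its own affinity mask and records the result.
static void* CheckAffinity(void* arg) {
  auto* check = static_cast<PinCheck*>(arg);
  cpu_set_t mask;
  CPU_ZERO(&mask);
  if (pthread_getaffinity_np(pthread_self(), sizeof(mask), &mask) == 0) {
    check->pinned = CPU_COUNT(&mask) == 1 && CPU_ISSET(check->cpu, &mask);
  }
  return nullptr;
}

// Creates a thread whose affinity is fixed through its attributes, so the
// thread never runs with a different mask. Returns true if the pin held.
bool StartPinnedThread(int cpu) {
  cpu_set_t mask;
  CPU_ZERO(&mask);
  CPU_SET(cpu, &mask);

  pthread_attr_t attr;
  if (pthread_attr_init(&attr) != 0) return false;

  PinCheck check{cpu, false};
  pthread_t thread;
  int err = pthread_attr_setaffinity_np(&attr, sizeof(mask), &mask);
  if (err == 0) err = pthread_create(&thread, &attr, CheckAffinity, &check);
  pthread_attr_destroy(&attr);
  if (err != 0) return false;

  pthread_join(thread, nullptr);
  return check.pinned;
}
```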
David Yat Sin d803a2ceb8 Bump HSA_AMD_INTERFACE_VERSION_MINOR
Bumping HSA_AMD_INTERFACE_VERSION_MINOR version to 5 to account for
previously added GPU agent query: HSA_AMD_AGENT_INFO_MEMORY_PROPERTIES

Change-Id: Ic8cfdcfb7bad6f3d1e0b3d68f505a62074fc26b9


[ROCm/ROCR-Runtime commit: b6829f7a72]
2024-04-29 12:55:18 +00:00
Kent Russell 0e9ad5e1a4 .github: Add CODEOWNERS file
Change-Id: Ia763b91177f1ae09d16e5968bed17b0dba62cbe5
Signed-off-by: Kent Russell <kent.russell@amd.com>


[ROCm/ROCR-Runtime commit: 5e1f24f305]
2024-04-26 09:21:39 -04:00
amd-jmacaran 3fc997dd0f Change token name to match IT-created token
Change-Id: Ic9189c012024c59cf5bad9daf25f6c2575a100fd


[ROCm/ROCR-Runtime commit: 587e4287f4]
2024-04-25 12:23:28 -04:00
David Yat Sin defe5ac509 Perform HDP flush for SDMA copies gfx10/gfx11
Perform HDP flush on gfx10/gfx11 PCIe devices.

Exclude gfx101x devices

Change-Id: Ief76c34634b09b0a7942cb71519d4082ca8b4fad


[ROCm/ROCR-Runtime commit: 3d999a1adf]
2024-04-24 18:07:34 -04:00
David Yat Sin b53648f8fe Add support for contiguous memory allocations
Support contiguous physical memory allocation flag. Allocations with
this flag will have contiguous physical memory. This is dependent on KFD
support for this flag and the AllocateKfdMemory(..) function call will
fail when it is not supported.

Change-Id: I6c51c8b061f7b026fdcc2aa2c37c74ecc13d95b6


[ROCm/ROCR-Runtime commit: 9af225e1b1]
2024-04-24 14:02:07 -04:00
David Yat Sin fa6f3477ad Remove assert for physical vs virtual memory size
On systems with more than 1 TB of memory per NUMA region, this triggers
unnecessary errors.

Change-Id: I1bc7f209b9c1739b516c9f6b0acf434488ac7b8d


[ROCm/ROCR-Runtime commit: e539c8dce2]
2024-04-24 08:43:23 -04:00
amd-jmacaran 278631d135 Add support for external CI builds using Azure Pipelines
Change-Id: I8f4de331f00317a959b86f7e5b7a1025ba03564b


[ROCm/ROCR-Runtime commit: 8a893ea0b8]
2024-04-23 21:10:49 -04:00
David Galiffi 495db71f32 Fixed MD linting issue regarding code blocks.
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Change-Id: Id0467b332bf033642a2d403090ffe598e41689f5


[ROCm/ROCR-Runtime commit: a8bd453243]
2024-04-23 16:48:15 -04:00
David Galiffi 6e5eed56e4 Fixed broken link to ROCm documentation
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Change-Id: I9a6bc0ad08a060d83fdc3a0589dfc81c68ce2b0e


[ROCm/ROCR-Runtime commit: 975c5dd24a]
2024-04-23 16:47:50 -04:00
David Galiffi bd35a68731 Fixed MD linting issue regarding headers
They should start at H1

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Change-Id: Id11a2599c4609255a1a9916f70b58adc41cdddb4


[ROCm/ROCR-Runtime commit: f94c1794bb]
2024-04-23 16:47:19 -04:00
David Galiffi 93869e57ba Update GitHub links to point to the new organization.
i.e., RadeonOpenCompute --> ROCm

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Change-Id: I6724cbcbbb525f767af297e3986cd61fa69cd49f


[ROCm/ROCR-Runtime commit: c9103a00ef]
2024-04-23 16:46:42 -04:00
Philip Yang a580869abd libhsakmt: Support contiguous VRAM allocation flag
Add HsaMemFlags Contiguous bit for hsaKmtAllocMemory to allocate
contiguous VRAM, to support RDMA device with limited scatter-gather
ability.

Check KFD ioctl minor version >= 17.

Change-Id: I0db00dad125b2b7be523f343082641f59b850423
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 97497c7efc]
2024-04-23 14:27:12 -04:00
Your Name f4ff320797 libhsakmt: Remove HsaMemFlag reserved bit init
New HsaMemFlag flags were added and the number of reserved bits was
reduced, which generates a value overflow compilation error.

The reserved bits are not used, so remove the init.

Change-Id: I603596977dfd558ce31ead03711d7c5ce5ee5b71
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: e2d742ac6f]
2024-04-23 14:27:12 -04:00
David Yat Sin 5bf08d0d70 Fix queue creation for PC Sampling
Fix lazy pointer initialization for dedicated PC Sampling queue.
Previous implementation would always create a queue on GPU agent
creation instead of creating the queue on first use.

Change-Id: Icf300f2b162e59143ba61ba182d9bee6e1308fc1


[ROCm/ROCR-Runtime commit: f2751b7030]
2024-04-22 19:00:48 +00:00
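The difference between creating the queue at agent creation and creating it on first use, as described in the commit above, can be sketched with a small lazy wrapper. `LazyPtr` here is a hypothetical stand-in written for illustration, not the runtime's own helper, and the `int` payload in the usage note stands in for a queue object.

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Holds a factory and builds the object only when Get() is first called.
template <typename T>
class LazyPtr {
 public:
  explicit LazyPtr(std::function<std::unique_ptr<T>()> factory)
      : factory_(std::move(factory)) {}

  // Constructs the object on the first call; later calls reuse it.
  T& Get() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (!object_) object_ = factory_();
    return *object_;
  }

 private:
  std::function<std::unique_ptr<T>()> factory_;
  std::unique_ptr<T> object_;
  std::mutex mutex_;
};
```

Constructing a `LazyPtr<int>` with a factory that counts its invocations shows the count staying at zero until `Get()` is called, and staying at one afterwards. Calling the factory in the owner's constructor instead would reproduce the eager behavior the commit fixes.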
Shweta.Khatri 4f4d215196 Fixing compilation errors related to MUSL libc
Fix Musl libc NULL errors and unsupported pthread funcs for compatibility.
Also ensures cleanup and error handling irrespective of CPU affinity override.

Fix submitted by github dev - AngryLoki
https://github.com/ROCm/ROCR-Runtime/issues/181

Change-Id: Ia487315e504112be5d3370756f23f6e23b9ae4be


[ROCm/ROCR-Runtime commit: bc9cac97fe]
2024-04-17 07:14:15 -04:00
David Yat Sin ed7cdb88e4 Adding queue information queries
New hsa_amd_queue_get_info API to support:

- HSA_AMD_QUEUE_INFO_AGENT: Agent that owns the underlying HW queue

- HSA_AMD_QUEUE_INFO_DOORBELL_ID: KFD doorbell ID of the queue
completion signal.

Change-Id: I98842131bcbdd08552649791a5d43e578a615808


[ROCm/ROCR-Runtime commit: d6d5786051]
2024-04-11 12:53:48 -04:00
David Yat Sin 395fd2d230 PC Sampling: Disable coredump when sessions active
When doing a coredump, we try to park the wave and save its PC in
ttmp7/ttmp11, but these registers will be overwritten by PC Sampling
requests.

Change-Id: I60fb734eb3bed4ee3cc8d8bba9ec4a527fff9671


[ROCm/ROCR-Runtime commit: 3443fdf665]
2024-04-11 12:53:43 -04:00
David Yat Sin 52fc6e8619 PC Sampling: Convert timestamps to system time
Convert timestamps inside samples to system time

Change-Id: I5fad9a6887fa27c0ded9aa9b5f251cba2868f88f


[ROCm/ROCR-Runtime commit: 49e56ce782]
2024-04-11 12:53:37 -04:00
David Yat Sin a84d407118 PC Sampling: Implement lost sample count
Change-Id: Idfdfbac71c1813dd7a97c301619cf8ce83713c53


[ROCm/ROCR-Runtime commit: 547c9cb143]
2024-04-11 12:53:31 -04:00
David Yat Sin 3d13db3c39 PC Sampling: Implement flush
Flush is used by the client to retrieve data currently stored in the
buffers, even when the buffers are not full.

Change-Id: Ib8304dcdfb2797cb060ec72df4970d95cf6be348


[ROCm/ROCR-Runtime commit: 8abbf9475b]
2024-04-11 12:53:24 -04:00
David Yat Sin b0d88f22dd PC Sampling: Push data to PC Sampling client
Each time there is enough data to fill the client session buffer,
callback the client data ready function to transfer the buffer contents
to the client.

Change-Id: Id79775426fa6d22e00dc2ef6f55c439eacb9b2af


[ROCm/ROCR-Runtime commit: 5177d17f5d]
2024-04-11 12:53:17 -04:00
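The data-ready callback from the commit above, together with the flush behavior from the preceding entry, can be sketched as follows. `ClientSession`, its methods, and the use of 64-bit integers as samples are assumptions made for illustration; the real session and sample types are not shown in this log.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Accumulates samples and hands them to the client in buffer-sized batches.
class ClientSession {
 public:
  using DataReady = std::function<void(const std::vector<std::uint64_t>&)>;

  ClientSession(std::size_t capacity, DataReady data_ready)
      : capacity_(capacity), data_ready_(std::move(data_ready)) {
    pending_.reserve(capacity_);
  }

  // Stores one sample; calls the client once the buffer is full.
  void Push(std::uint64_t sample) {
    pending_.push_back(sample);
    if (pending_.size() >= capacity_) Deliver();
  }

  // Delivers whatever is pending, even if the buffer is not full.
  void Flush() {
    if (!pending_.empty()) Deliver();
  }

 private:
  void Deliver() {
    data_ready_(pending_);
    pending_.clear();
  }

  std::size_t capacity_;
  DataReady data_ready_;
  std::vector<std::uint64_t> pending_;
};
```

With a capacity of 2, pushing three samples triggers one callback with two samples, and a subsequent `Flush()` delivers the remaining one.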
David Yat Sin d8b7c22609 PC Sampling: Retrieve data from trap handler
Retrieve data from the buffers previously set in the 2nd level trap
handler TMA. We use a double buffering mechanism to allow the 2nd level
trap handler to write to one buffer while we are copying data from the
other.

Co-authored by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Co-authored by: James Zhu <James.Zhu@amd.com>

Change-Id: I252c381ea06b8cf927c4f9af6ea59dedc3717fbb


[ROCm/ROCR-Runtime commit: 855e454671]
2024-04-11 12:53:12 -04:00
David Yat Sin b11294e555 PC Sampling: Update 2nd level trap handler
Update 2nd level trap handler when PC Sampling is enabled

Change-Id: I95bf2bca8057d2f8313923c7f012f033e12ccc3a


[ROCm/ROCR-Runtime commit: efdb72fd71]
2024-04-11 12:53:06 -04:00
David Yat Sin bb10ff65c2 PC Sampling: Allocate resources to retrieve data from trap handler
Allocate required device and host buffers to be able to interact with
the 2nd level trap handler.

Change-Id: If99de5aacf956ca57ecafc7b04b797be9c9decaa


[ROCm/ROCR-Runtime commit: 8d666dea01]
2024-04-11 12:53:00 -04:00
Joseph Greathouse 5c61936483 PC Sampling: Add gfx9 2nd trap handler for PC Sampling
Code is valid for gfx9 GPUs excluding gfx94x.

1st level trap handler will use TTMP13[22] to indicate host trap and
TTMP13[21] to indicate stochastic trap.

For each PC sampling method (hosttrap and stochastic), we use a double
buffering mechanism to transfer data between GPU and host.
The GPU will dump data into one buffer while CPU may be reading data
from the other buffer. There are 2 separate signals, one for each
buffer.
When signal != 0, the buffer belongs to the GPU and the GPU can write
to it. Once the buffer has reached the high watermark, the GPU will
set the signal to 0 to wake up the host and so that the host can try
to switch the buffers and read the data.

Co-authored-by: David Yat Sin <David.YatSin@amd.com>
Change-Id: If3eb0913e52fb4788059a71e5feca334612f3d5d


[ROCm/ROCR-Runtime commit: 431a70471e]
2024-04-11 12:52:54 -04:00
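The ownership handoff described in the commit above can be modeled on the host with a minimal sketch. It follows the stated rule (signal != 0 means the producer owns the buffer, and the producer sets it to 0 at the high watermark), but `DoubleBuffer`, its methods, and the choice to drop samples while waiting for the host are assumptions for illustration. The real producer is GPU trap handler code, and this model does not address the synchronization a concurrent implementation would need.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

struct SampleBuffer {
  std::atomic<int> signal{1};  // != 0: producer owns it; 0: ready for host
  std::vector<std::uint64_t> samples;
};

class DoubleBuffer {
 public:
  explicit DoubleBuffer(std::size_t watermark) : watermark_(watermark) {}

  // Producer side. Returns false if the sample is dropped because the
  // active buffer was handed over and the host has not switched yet.
  bool Produce(std::uint64_t sample) {
    SampleBuffer& buf = buffers_[active_.load(std::memory_order_acquire)];
    if (buf.signal.load(std::memory_order_acquire) == 0) return false;
    buf.samples.push_back(sample);
    if (buf.samples.size() >= watermark_) {
      buf.signal.store(0, std::memory_order_release);  // wake up the host
    }
    return true;
  }

  // Host side. If the active buffer was handed over, switch the producer
  // to the other buffer, take the data, and give the buffer back.
  std::vector<std::uint64_t> Drain() {
    const int current = active_.load(std::memory_order_acquire);
    SampleBuffer& full = buffers_[current];
    if (full.signal.load(std::memory_order_acquire) != 0) return {};
    active_.store(current ^ 1, std::memory_order_release);
    std::vector<std::uint64_t> out;
    out.swap(full.samples);
    full.signal.store(1, std::memory_order_release);
    return out;
  }

 private:
  std::size_t watermark_;
  std::atomic<int> active_{0};
  std::array<SampleBuffer, 2> buffers_;
};
```

With a watermark of 2, the third `Produce` call is rejected until `Drain` switches buffers, after which the producer continues in the second buffer while the host holds the first batch.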
David Yat Sin c3f9368b8f PC Sampling: Create dedicated CP queue
Create dedicated CP queue with highest priority for PC Sampling. Reduce
the highest priority that LRT's can set for existing API so that PC
Sampling queue will always have highest priority over any other CP
queues

Change-Id: Ia70d74415edc83b4862a3e18dbdbd7cebe73ab47


[ROCm/ROCR-Runtime commit: a83f872a23]
2024-04-11 12:52:48 -04:00
David Yat Sin bcdecc7ff4 PC Sampling: Add start stop and flush APIs
Create PC Sampling APIs for start and stop functions. And create stub
for flush function.

Change-Id: I7a093b29dc87e34ac06faaae6cac2be50e4663e1


[ROCm/ROCR-Runtime commit: a842247482]
2024-04-11 12:52:42 -04:00
David Yat Sin 566e2c60fd PC Sampling: Add create and destroy APIs
Implement PC Sampling session create and destroy APIs.

Change-Id: I93370d3d01b74ee15e71b8b0e20feb8f0066a3dc

Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Signed-off-by: Vladimir Indic <Vladimir.Indic@amd.com>
Change-Id: Ib0c64356a1a4616b12d5dbeebe16273fe2a84abe


[ROCm/ROCR-Runtime commit: 632f9e60f7]
2024-04-11 12:52:35 -04:00
David Yat Sin 0a4415f202 PC Sampling: API to list supported configurations
Add new PC Sampling API to list the supported PC Sampling methods and
options on a specific agent. If there is already a PC Sampling session
active on this agent, the list of methods returned will be reduced to
methods that can be run simultaneously with the current active session.

Change-Id: I42ac2b8f30d5c368faf8ed4cf37ca4134db22985


[ROCm/ROCR-Runtime commit: 295acf6b27]
2024-04-11 12:52:30 -04:00
David Yat Sin 8165c03e7b PC Sampling: Create PC Sampling interfaces
Create new interface group for PC Sampling

Change-Id: I59b4cfe9f8d1ae313dc28be1d2ed49f750d8212b


[ROCm/ROCR-Runtime commit: 0bc244e10a]
2024-04-11 12:52:23 -04:00
David Yat Sin b79e044711 PC Sampling: Update public headers for new APIs
Change-Id: Ib9987efdb41d5f6d203e7e86f9b26809d020e04e


[ROCm/ROCR-Runtime commit: 6a7122b183]
2024-04-11 12:52:16 -04:00
David Yat Sin bc77ec44f6 Create fine-grained allocator
Create allocator helper function to provide fine-grained memory on
a specific agent.

Change-Id: I32ba9aceb9c9dc708b140a0c45158e6e7a018844


[ROCm/ROCR-Runtime commit: 71f1a6726c]
2024-04-11 12:52:10 -04:00
David Yat Sin 6b38aecae9 Extend ExecutePM4() to accept completion signal and fences
ExecutePM4() function can optionally accept extra arguments for
acquire fence scope, release fence scope and completion signal. When
a completion signal is provided, ExecutePM4() does not wait for the
commands to complete.

Change-Id: Ib2a433b7bce1cb6260be8b76fe902335bd5dfada


[ROCm/ROCR-Runtime commit: 721e56ef5c]
2024-04-11 12:51:52 -04:00