Commit Graph

12 Commits

Author SHA1 Message Date
kjayapra-amd 173bb2af6e SWDEV-236178 - Store texture reference metadata for dynamically loaded modules.
Change-Id: I99ecc80da7e29c691341a01a09e4532972f1e3e5
2020-06-11 22:34:50 -04:00
kjayapra-amd 20f05c4228 SWDEV-236178 - Reorganizing Platform/Modules code for easy access.
Change-Id: Ie8920260ffc4ff01e44b48af8cec9ea5aed1aa9b
2020-06-11 10:11:20 -04:00
kjayapra-amd 8941d19fe8 SWDEV-234295 - Pass flag to ROCclr to not clear device programs during program::build()
Change-Id: I50b9fa1a96da6895f73fdf4a7c0d3f096b1188da
2020-06-05 09:53:11 -04:00
Jatin 2d517fdcc6 Adding changes for hipExtLaunchKernel for rocCLR
Change-Id: Iba52bc3bde7c37f3fb375a55ba0947e87b3cdc9b
2020-06-02 14:16:41 -04:00
Joseph Greathouse ebe5054e04 Fix occupancy calculation functions in ROCclr path
The hipOccupancyMaxPotentialBlockSize API is meant to return the
number of threads for the highest-occupancy workgroup, and the number
of those workgroups. It was previously calculating the number of
maximum-sized workgroups that would fit on a single CU. This is
a mixture of the API we wanted (to calculate max potential block size)
and the MaxBlocksPerMultiprocessor function.

This patch fixes it up so that the internal occupancy calculation
function works for two uses: the traditional function that calculates
the maximum blocks per multiprocessor when a user passes in a fixed
block size (used for hipMaxBlocksPerMultiprocessor style functions)
and a function that calculates the size of a block that would lead
to maximum occupancy, and how many blocks of that size would be
needed to fill the whole GPU (for hipOccupancyMaxPotentialBlockSize
style functions).

This also updates the occupancy calculation function to prepare for
gfx10, which does not have SGPR-based occupancy limits.

Change-Id: Ie007b3f9d5ebc4e166b50a3a051498af35650f35
2020-05-28 10:22:10 -05:00
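
The two uses described in the commit message above can be sketched roughly as follows. This is a simplified illustration, not the ROCclr implementation: the `CuLimits` struct, both function names, and every hardware number are hypothetical, and the model considers only VGPR and LDS limits (SGPR limits are left out, which the message notes do not apply on gfx10 anyway).

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-device limits; all names and values are illustrative only.
struct CuLimits {
    int waveSize;         // threads per wavefront
    int simdsPerCu;       // SIMD units per compute unit (CU)
    int maxWavesPerSimd;  // hardware cap on resident waves per SIMD
    int vgprsPerSimd;     // VGPRs available to each lane on a SIMD
    int ldsBytesPerCu;    // LDS (shared memory) per CU
    int maxBlockSize;     // largest workgroup size allowed
    int numCus;           // compute units on the device
};

// Use 1: how many blocks of a fixed, user-supplied size fit on one CU.
int maxBlocksPerCu(const CuLimits& hw, int blockSize,
                   int vgprsPerThread, int ldsBytesPerBlock) {
    assert(hw.waveSize > 0);
    if (blockSize <= 0 || blockSize > hw.maxBlockSize) return 0;

    // Round the block up to whole wavefronts.
    int wavesPerBlock = (blockSize + hw.waveSize - 1) / hw.waveSize;

    // Register pressure may reduce the number of resident waves per SIMD.
    int wavesPerSimd = hw.maxWavesPerSimd;
    if (vgprsPerThread > 0)
        wavesPerSimd = std::min(wavesPerSimd, hw.vgprsPerSimd / vgprsPerThread);

    int blocks = (wavesPerSimd * hw.simdsPerCu) / wavesPerBlock;

    // LDS usage may reduce the block count further.
    if (ldsBytesPerBlock > 0)
        blocks = std::min(blocks, hw.ldsBytesPerCu / ldsBytesPerBlock);
    return blocks;
}

struct LaunchConfig {
    int blockSize;  // block size giving the highest occupancy
    int gridSize;   // blocks of that size needed to fill the whole GPU
};

// Use 2: search for the block size that keeps the most threads resident on a
// CU, then scale the per-CU block count by the number of CUs.
LaunchConfig maxPotentialBlockSize(const CuLimits& hw, int vgprsPerThread,
                                   int ldsBytesPerBlock) {
    LaunchConfig best{0, 0};
    int bestThreads = 0;
    for (int bs = hw.waveSize; bs <= hw.maxBlockSize; bs += hw.waveSize) {
        int blocks = maxBlocksPerCu(hw, bs, vgprsPerThread, ldsBytesPerBlock);
        int threads = blocks * bs;
        // Ties keep the smaller block size; an arbitrary choice for this sketch.
        if (threads > bestThreads) {
            bestThreads = threads;
            best = {bs, blocks * hw.numCus};
        }
    }
    return best;
}
```

With made-up limits where a kernel needs 128 VGPRs per thread, a maximum-sized 1024-thread block does not fit on a CU at all, while smaller blocks do. That is why counting only maximum-sized workgroups, as the old code did, gives a poor answer for the "max potential block size" use.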
Saleel Kudchadker fb2d7bcd2b Fix elapsed time calculation for null stream
SWDEV-237377 - This fixes time calculation where the event may
be recorded on Null stream and work submitted on other streams

Change-Id: Ie36310dea5cee2fed4a514ed01f04db4b47e571c
2020-05-27 18:42:07 -04:00
Evgeny 5abb8e1a68 API tracing instrumentation
Change-Id: I257409b9fe299b009ded3e3a43287322d5f93a70
2020-05-14 11:03:09 -05:00
Vlad Sytchenko fec51e85b0 Correct HIP_FUNC_ATTRIBUTE_NUM_REGS query
Change-Id: I526cc7871c690260df0fa8c1b3b4b15fbc5af219
2020-05-09 12:42:30 -04:00
Vlad Sytchenko 1b1c032e9f Correct HIP_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK query
We should be returning the max workgroup size calculated by the compiler.

Change-Id: If86590efbb9b291f470bdbe87e5df992e661c539
2020-05-08 14:36:47 -04:00
Vlad Sytchenko a373538d72 Fix confusion in hipFuncGetAttribute()
Cuda shared == OpenCL local

Cuda local == OpenCL private

Change-Id: I5a204945ecde35919b9e9def20bbb2662fffea2b
2020-05-08 14:36:36 -04:00
kjayapra-amd 5e91bee221 SWDEV-232464 - Need to initialize image with ptr passed since they can pass image not of type __ClangOffloadBundler.
Change-Id: I2c50042220a0230bc445ed21728f114a229c53e1
2020-05-06 14:25:43 -04:00
Payam c5f76c3de3 name change vdi to rocclr
Change-Id: I06d198bbb4a499e153b290b73a92afed3553b252
2020-05-06 09:14:30 -04:00