Grafico dei commit

12 Commit

Autore SHA1 Messaggio Data
kjayapra-amd 0b788c4c67 SWDEV-236178 - Store texture reference metadata for dynamically loaded modules.
Change-Id: I99ecc80da7e29c691341a01a09e4532972f1e3e5
2020-06-11 22:34:50 -04:00
kjayapra-amd 840347f0d0 SWDEV-236178 - Reorganizing Platform/Modules code for easy access.
Change-Id: Ie8920260ffc4ff01e44b48af8cec9ea5aed1aa9b
2020-06-11 10:11:20 -04:00
kjayapra-amd 9261a35be9 SWDEV-234295 - Pass flag to ROCclr to not clear device programs during program::build()
Change-Id: I50b9fa1a96da6895f73fdf4a7c0d3f096b1188da
2020-06-05 09:53:11 -04:00
Jatin 126573df4c Adding changes for hipExtLaunchKernel for rocCLR
Change-Id: Iba52bc3bde7c37f3fb375a55ba0947e87b3cdc9b
2020-06-02 14:16:41 -04:00
Joseph Greathouse 90453b68d3 Fix occupancy calculation functions in ROCclr path
The hipOccupancyMaxPotentialBlockSize API is meant to return the
number of threads for the highest-occupancy workgroup, and the number
of those workgroups. It was previously calculating the number of
maximum-sized workgroups that would fit on a single CU. This is
a mixture of the API we wanted (to calculate max potential block size)
and the MaxBlocksPerMultiprocessor function.

This patch fixes it up so that the internal occupancy calculation
function works for two uses: the traditional function that calculates
the maximum blocks per multiprocessor when a user passes in a fixed
block size (used for hipMaxBlocksPerMultiprocessor style functions)
and a function that calculates the size of a block that would lead
to maximum occupancy, and how many blocks of that size would be
needed to fill the whole GPU (for hipOccupancyMaxPotentialBlockSize
style functions).

This also updates the occupancy calculation function to prepare for
gfx10, which does not have SGPR-based occupancy limits.

Change-Id: Ie007b3f9d5ebc4e166b50a3a051498af35650f35
2020-05-28 10:22:10 -05:00
Saleel Kudchadker facb05495f Fix elapsed time calculation for null stream
SWDEV-237377 - This fixes time calculation where the event may
be recorded on Null stream and work submitted on other streams

Change-Id: Ie36310dea5cee2fed4a514ed01f04db4b47e571c
2020-05-27 18:42:07 -04:00
Evgeny 10cb7645dc API tracing instrumentation
Change-Id: I257409b9fe299b009ded3e3a43287322d5f93a70
2020-05-14 11:03:09 -05:00
Vlad Sytchenko a88e52ba80 Correct HIP_FUNC_ATTRIBUTE_NUM_REGS query
Change-Id: I526cc7871c690260df0fa8c1b3b4b15fbc5af219
2020-05-09 12:42:30 -04:00
Vlad Sytchenko b5f9d2f818 Correct HIP_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK query
We should be returning the max workgroup size calculated by the compiler.

Change-Id: If86590efbb9b291f470bdbe87e5df992e661c539
2020-05-08 14:36:47 -04:00
Vlad Sytchenko 276bfc9667 Fix confusion in hipFuncGetAttribute()
Cuda shared == OpenCL local

Cuda local == OpenCL private

Change-Id: I5a204945ecde35919b9e9def20bbb2662fffea2b
2020-05-08 14:36:36 -04:00
kjayapra-amd 8f38c3260e SWDEV-232464 - Need to initialize image with ptr passed since they can pass image not of type __ClangOffloadBundler.
Change-Id: I2c50042220a0230bc445ed21728f114a229c53e1
2020-05-06 14:25:43 -04:00
Payam dba0e72de2 name change vdi to rocclr
Change-Id: I06d198bbb4a499e153b290b73a92afed3553b252
2020-05-06 09:14:30 -04:00