This change addresses the rocprofiler and HIP backward compatibility
issues. Before this patch, each time the hip_prof_str.h header was
generated, the ordering of the callback IDs changed, causing
incompatibilities between tools compiled with the old header and
runtimes compiled with the new headers (or vice versa).
To make the API callback IDs stable, the previous version of the header
is read to extract the enum values so that the same values can be
assigned in the new header.
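
The ID reuse described above can be sketched as follows. This is only an
illustration, not the generator's actual code: the function name
AssignStableIds, the map-based representation of the parsed header, and the
choice to keep IDs of APIs that no longer appear are all assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: assign callback IDs for the new header while reusing
// every ID already present in the previous header.
// `previous` maps API name -> ID as parsed from the old header.
// `current_apis` lists the API names found while generating the new header.
std::map<std::string, int> AssignStableIds(
    const std::map<std::string, int>& previous,
    const std::vector<std::string>& current_apis) {
  // Assumption: old entries are kept as-is, so their IDs are never renumbered.
  std::map<std::string, int> result = previous;

  // New IDs start after the largest ID used so far.
  int next_id = 0;
  for (const auto& entry : previous) {
    assert(entry.second >= 0);
    if (entry.second >= next_id) next_id = entry.second + 1;
  }

  // APIs not seen before are appended in the order they are encountered.
  for (const std::string& name : current_apis) {
    if (result.find(name) == result.end()) {
      result[name] = next_id++;
    }
  }
  return result;
}
```

With this scheme a tool built against the old header still sees the same
numeric value for every API it knows about, regardless of the order in which
the generator visits the APIs.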
Also, to make diffing different versions of the hip_prof_str.h easier
to read, all other sections (types, macros, helper functions) are now
alphabetically ordered.
If an update to the checked-in hip_prof_str.h file is required, the
cmake build is aborted and a message is printed to stderr. The build will
not be successful until the checked-in hip_prof_str.h and the generated
hip_prof_str.h match.
Change-Id: I38b920e601185f7365a76a6584df91a7e8a11798
Re-enable __HIPCC_RTC__ in hip_vector_types.h, which requires
an upstream clang patch, 6823af0ca858b54e09e5be61a19d067ccd0bd6b7.
Once the upstream patch has landed in mainline, merge this for
hiprtc functionality.
Change-Id: Ife884e0c3081b307bdadc8bec7804d1d7c60153b
Add a coarse-grain memory extension. The new advice allows HMM
to disable the cache coherency policy to improve performance.
Change-Id: I3c792d6a96896b983a7ffccddaa0ded06d183212
Temporarily disable __HIPCC_RTC__ in hip_vector_types.h
while the upstream clang headers are outdated on mainline.
Once the upstream patch has landed in mainline, revert this
change. This is a workaround for hiprtc testing.
Change-Id: Ib2cf6023b71431bbfbe3c699076caa4f90f7170c
Windows may expect long and ulong to be 4 bytes, while
Linux expects 8 bytes. To be consistent, use uint64_t for
unsigned long and unsigned long long, and use int64_t
for long and long long.
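
A minimal sketch of the idea, assuming hypothetical alias and struct names
(hip_long_t, hip_ulong_t, and Long2 are not taken from the HIP headers):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical aliases: fixed-width types have the same size on Windows
// (where long is 4 bytes) and on Linux (where long is 8 bytes).
using hip_long_t = std::int64_t;
using hip_ulong_t = std::uint64_t;

static_assert(sizeof(hip_long_t) == 8, "signed 64-bit on every platform");
static_assert(sizeof(hip_ulong_t) == 8, "unsigned 64-bit on every platform");

// Hypothetical two-component vector built on the fixed-width alias; its
// layout no longer depends on the platform's definition of long.
struct Long2 {
  hip_long_t x;
  hip_long_t y;
};
```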
Change-Id: I6ed1cdde43721bcaaab0245644d607b1adbf9884
For hipRTC on Windows, add the macro __HIPCC_RTC__ to allow
online compilation with device functions, excluding standard
C/C++ headers, system headers, and host HIP APIs.
Change-Id: I1d91f042baf1359856ec83ab7030dc58785e0334
After hipStreamBeginCapture, the parameters passed to each API are captured, and a corresponding node is created and added to the graph.
All parameters are passed to the STREAM_CAPTURE macro, which checks whether the stream is in capture mode and, if so, redirects the call to the capture function and returns.
Updated hipStream and hipEvent with capture parameters.
Added handling for hipStreamBeginCapture and hipStreamEndCapture.
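
The redirect performed by the macro can be sketched as below. Only the macro
name STREAM_CAPTURE comes from this change; the Stream, Graph, and Status
types and every function name are simplified, hypothetical stand-ins for the
runtime's own.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the runtime's graph and stream objects.
struct Graph {
  std::vector<std::string> nodes;
};
struct Stream {
  bool capturing = false;
  Graph* graph = nullptr;
};
enum Status { kSuccess = 0 };

// Capture function: records a node instead of executing the operation.
Status CaptureMemset(Stream* s, void* dst, int value, std::size_t bytes) {
  (void)dst; (void)value; (void)bytes;
  assert(s->graph != nullptr);
  s->graph->nodes.push_back("memset");
  return kSuccess;
}

// If the stream is in capture mode, forward all arguments to the capture
// function and return its status from the enclosing API function.
#define STREAM_CAPTURE(capture_fn, stream, ...)          \
  do {                                                   \
    if ((stream) != nullptr && (stream)->capturing) {    \
      return capture_fn((stream), __VA_ARGS__);          \
    }                                                    \
  } while (0)

int g_executed = 0;  // counts operations that actually ran

Status MemsetAsync(void* dst, int value, std::size_t bytes, Stream* stream) {
  STREAM_CAPTURE(CaptureMemset, stream, dst, value, bytes);
  ++g_executed;  // normal (non-capturing) path
  return kSuccess;
}

Status BeginCapture(Stream* s, Graph* g) {
  s->capturing = true;
  s->graph = g;
  return kSuccess;
}

Status EndCapture(Stream* s) {
  s->capturing = false;
  s->graph = nullptr;
  return kSuccess;
}
```

Placing the check in a macro lets the early return happen in the API function
itself, so each entry point needs only one extra line.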
Change-Id: Ic8926a7b4336c2cc81f0b3a9a224aa392c474134
Selector indices are as follows (the upper 16 bits of the selector are not used):
selector[0] = s<2:0>
selector[1] = s<6:4>
selector[2] = s<10:8>
selector[3] = s<14:12>
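
The bit layout above can be read as: index i occupies the low three bits of
nibble i. A small sketch of the extraction (the function name SelectorIndex
is an assumption, not part of the change):

```cpp
#include <cassert>
#include <cstdint>

// Returns selector[i] for i in 0..3: bits <4*i+2 : 4*i> of `s`.
// Bit 4*i+3 of each nibble and the upper 16 bits are ignored.
inline std::uint32_t SelectorIndex(std::uint32_t s, unsigned i) {
  assert(i < 4);
  return (s >> (4u * i)) & 0x7u;
}
```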
Change-Id: Ibf76c6ec2374f1f5b9bba8bd9dbd73660f830eea
Change-Id: I5daeacd9dd5c6ce7f914d6e6e45dd41fb2a675a5
hipMemRangeGetAttributes was returning hipErrorInvalidValue due to improper
mapping of the arguments to cudaMemRangeGetAttributes.
Add concurrentManagedAccess detection in hipMallocManaged test.
Skip test when device doesn't support concurrentManagedAccess.
Change-Id: Ie54046feef3baba857a7068972ec1fc1a60c2dfd
HIP headers use a few structure names such as X, Y, and Z. This causes
compilation issues when apps define the same names as macros.
Renamed the structs to use reserved names such as
__X, __Y, and __Z.
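
A minimal illustration of the collision, with a hypothetical struct body and
helper (only the names X and __X come from this change). Identifiers
containing a double underscore are reserved for the implementation, so
application macros are not expected to use them:

```cpp
#include <cassert>

// An application macro that happens to use a short name.
#define X 42

// Header-side struct. Had it been named `X`, the preprocessor would have
// rewritten `struct X` into `struct 42` and compilation would fail.
struct __X {
  int value;
};

// Hypothetical helper showing that the macro and the struct now coexist.
inline int MakeValue() {
  __X v{X};  // `X` expands to 42; `__X` is left untouched
  return v.value;
}
```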
Change-Id: I59416c3734f274e853c87d4856b7e616f6cee5f5