For the pitch workaround, the image must be copied to the copy buffer
when the application reads an image into a buffer. After this
patch, we unconditionally copy the image data to the copy buffer.
Change-Id: I71b0d19459542dfbb3ca51a2c8a3a81367fa2fb5
[ROCm/clr commit: 5330679473]
The existing workgroup calculation logic for GWS initialization is
incorrect. It tries to add together workgroups across dimensions,
leading to a major under-count in 2D and 3D kernels. An (x,y,z) kernel
uses x * y * z blocks, not x + y + z.
In addition, the previous logic was incorrect for the case of launching
a single-threaded kernel. It calculated 0 workgroups, leading to
initializing GWS to -1.
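A minimal sketch of the corrected counting, assuming the per-dimension
count is the global size divided by the workgroup size, rounded up. The
helper name countWorkgroups and its signature are hypothetical and are
not taken from the runtime:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: total number of workgroups for a launch with the
// given global and local (workgroup) sizes in up to three dimensions.
inline std::uint64_t countWorkgroups(const std::size_t global[3],
                                     const std::size_t local[3],
                                     unsigned dims) {
  std::uint64_t total = 1;
  for (unsigned i = 0; i < dims; ++i) {
    assert(local[i] > 0);  // assumption: the workgroup size is never zero
    // Round up so a partially filled workgroup still counts as one.
    const std::uint64_t perDim = (global[i] + local[i] - 1) / local[i];
    total *= perDim;  // multiply across dimensions, never add
  }
  return total;
}
```

With this form a single-threaded launch yields one workgroup rather
than zero, and a 2D launch of 8 x 8 workgroups yields 64 rather than 16.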
Change-Id: I1bb20a0d5b6e0cc10ac55901c28d8f93aac61c09
[ROCm/clr commit: 54d1d69c0a]
- The logic tracks compute and SDMA read/write operations and
applies signals when necessary
- ROC_CPU_WAIT_FOR_SIGNAL, ROC_SYSTEM_SCOPE_SIGNAL
and ROC_SKIP_COPY_SYNC were added to control the tracking
Change-Id: I9e8e6174c63bf7784f7ab00964e2918c8667d364
[ROCm/clr commit: dbc7abaecf]
- If ROCR fails the call for some reason, the signal becomes
invalid and can hang on a wait. The logic now resets the
active signal in such cases
Change-Id: Ia131420200f1bbd7c9a162b8f1b06db8cecf41c6
[ROCm/clr commit: ce2e5eba6b]
- There is a performance regression with a HW wait for an HSA signal
on ROCr async operations. For now, move the logic back to a CPU wait.
- Fix a profiling issue with multiple HSA signals per single timestamp
object. Some copies require multiple ROCR calls, and if profiling is
required, the execution time is derived from all used signals.
Change-Id: Id003e4abb8c2de378eedc152a7e389500fc6f4ce
[ROCm/clr commit: 5a8946190a]
Remove targetId_, gfxipMajor_, gfxipMinor_ and gfxipStepping_ from
device::Info as they are now available in device::Isa.
Change-Id: I381b1d4798ebf50655740e004a01ac7f86dbf668
[ROCm/clr commit: c2308216dd]
- Correct GSL path to report targets using the TargetID syntax.
- Correct GSL path to check compatibility of code objects when
loading.
- Add the concept of a device ISA and create a registry used by ROCm,
PAL and GSL.
- Support XNACK and SRAMECC target features consistently for PAL and ROCm.
- Correct logic for NullDevices and asserts to avoid memory corruption.
- Allow all NullDevices to be created for HIP.
- Numerous other code improvements.
Change-Id: I40abf3d2b22249c1492d1af5919665f8184f4e0e
[ROCm/clr commit: c7e8d91e14]
Use strncpy instead of strcpy to ensure the arrays will not be
overflowed. Copy at most one less than the size of the char array so
that a NUL character remains at the end even if the copy is truncated,
provided the original object is zeroed memory.
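A minimal sketch of the pattern, using a hypothetical DeviceNames
struct and setName helper rather than the real device::Info fields:

```cpp
#include <cstring>

// Hypothetical stand-in for a struct holding fixed-size name arrays.
struct DeviceNames {
  char name_[16];
};

// Copies src into info.name_, truncating if necessary. The final byte is
// never written, so it stays NUL as long as info started out zeroed.
inline void setName(DeviceNames& info, const char* src) {
  std::strncpy(info.name_, src, sizeof(info.name_) - 1);
}
```

The guarantee depends on the object being zero-initialized, for example
with `DeviceNames info = {};`. strncpy itself does not append a NUL
when the source is longer than the count.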
Change-Id: I00f7679630cf28dcb9a51cb0aba2810a4f4c72b9
[ROCm/clr commit: e0448535a3]
For roc devices, create hsa_agent_t handles using a pointer to the
device::Device. This ensures each device has a different hsa_agent_t
handle. This may be necessary to ensure the loader symbol lookup will
search only symbols for the correct device.
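A sketch of the idea, using stand-in types so it compiles without the
HSA headers. AgentHandle mirrors the shape of hsa_agent_t (a struct
holding a 64-bit handle); FakeDevice and makeAgentHandle are
hypothetical names:

```cpp
#include <cstdint>

// Stand-in with the same shape as hsa_agent_t: one opaque 64-bit handle.
struct AgentHandle {
  std::uint64_t handle;
};

// Hypothetical stand-in for device::Device.
struct FakeDevice {};

// Derives the handle from the device's address. Distinct live objects
// have distinct addresses, so each device gets a different handle.
inline AgentHandle makeAgentHandle(const FakeDevice* dev) {
  return AgentHandle{
      static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(dev))};
}
```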
Change-Id: Iee6dd40d68bf22a02ce8c75cbe5ac8f5a0d9e418
[ROCm/clr commit: 76c371d78a]
roc::Settings and pal::Settings are derived classes. Allocate them as
their derived class, then assign that to the base class member to avoid
the need for a static_cast.
Use device::settings to access the Settings consistently.
Change-Id: I0f85157962fbf6fed176da0caf83b723bcbe1452
[ROCm/clr commit: 583dddf6b6]
- Use std::unique_ptr to leverage RAII for managing the allocation in
the presence of errors.
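A minimal sketch of the pattern, assuming a factory that allocates an
object and then runs a fallible init step. Resource and createResource
are hypothetical names used only for illustration:

```cpp
#include <memory>

// Hypothetical object with a fallible initialization step. The counter
// exists only to make the cleanup observable.
struct Resource {
  static int live;
  Resource() { ++live; }
  ~Resource() { --live; }
  bool init(bool succeed) { return succeed; }
};
int Resource::live = 0;

// Returns an initialized Resource, or nullptr on failure. On the error
// path the unique_ptr deletes the allocation; no manual delete is needed.
inline Resource* createResource(bool succeed) {
  std::unique_ptr<Resource> res(new Resource());
  if (!res->init(succeed)) {
    return nullptr;  // res goes out of scope and frees the object
  }
  return res.release();  // hand ownership to the caller on success
}
```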
Change-Id: I55de515bbf72938e1dd09731c5e51f538cf9d34a
[ROCm/clr commit: 682774a09d]
Make roc::Device::getBackendDevice non-virtual and only defined by
roc::Device and not roc::NullDevice. It is only meaningful for an
online device.
Change-Id: Ic333a3a42f650bea524e80dab587a34f1353e593
[ROCm/clr commit: 08f3126f6c]
Delete roc::Program::hsaDevice and access the HSA device directly from
the device associated with the program. This makes it clear when the
device is a NullDevice, which has no meaningful HSA agent backend device.
Change-Id: I81f96aff47bf9b8166d0ff6a5efc7c01f0fb6de3
[ROCm/clr commit: 783fe2e01b]
Make the roc::Device constructor and create() method private, as
devices are created by the static factory methods.
Change-Id: Ifa2edb8ec645b4ce6070c4aef355b9ef88294cf1
[ROCm/clr commit: c1ea70b539]
- Add assertions to enforce that objects are of the correct kind and
have been allocated.
- Make destructors check if objects have been allocated before
deleting.
- Operations that require a non-NullDevice return failure if given a
NullDevice.
- Use static_cast rather than reinterpret_cast when coercing from a
base class to a derived class.
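The points above can be sketched as follows. The class hierarchy and
the getQueueCount helper are hypothetical and only illustrate the
assert, fail-on-offline and static_cast pattern:

```cpp
#include <cassert>

// Hypothetical hierarchy: an offline device and an online device that
// derives from it and adds runtime-only operations.
struct BaseDevice {
  virtual ~BaseDevice() = default;
  virtual bool isOnline() const = 0;
};
struct OfflineDevice : BaseDevice {
  bool isOnline() const override { return false; }
};
struct OnlineDevice : OfflineDevice {
  bool isOnline() const override { return true; }
  int queueCount() const { return 4; }
};

// Returns false when given an offline device; otherwise downcasts with
// static_cast, which the compiler checks against the class hierarchy.
inline bool getQueueCount(const BaseDevice* dev, int* out) {
  assert(dev != nullptr && out != nullptr);
  if (!dev->isOnline()) {
    return false;
  }
  *out = static_cast<const OnlineDevice*>(dev)->queueCount();
  return true;
}
```

Unlike reinterpret_cast, static_cast is rejected at compile time when
the two types are unrelated, and it adjusts the pointer correctly when
the base subobject is not at offset zero.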
Change-Id: I02ee0ea9d7982fd7ca29d49c9b02cfae111b7127
[ROCm/clr commit: e5431676d4]
Rename functions that access devices to reflect the derived device
they return. This includes the base device::Device and the derived
gpu/pal/roc device classes in both NullDevice and Device forms. Change
to use the least derived versions to clarify what operations will be
available.
Change-Id: I1abb6bfed7efa24852bc8d0d49acaea357d8b5d0
[ROCm/clr commit: 001fd66cac]
Move declaration of kMaxAsyncQueues in rocdefs.h into the roc
namespace and adjacent to the other definitions.
Change-Id: Ibd319e3cc191945bacb9c06e1b31967717c1c87c
[ROCm/clr commit: f679b05df7]
Add a virtual destructor to device::Settings since derived classes are
allocated, but are deleted using the base class. This ensures the
destructors of the derived classes will be executed if present.
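A minimal sketch of why this matters, using hypothetical BaseSettings
and DerivedSettings types in place of the real classes:

```cpp
// Hypothetical base class. The virtual destructor makes deletion through
// a base pointer dispatch to the most derived destructor.
struct BaseSettings {
  virtual ~BaseSettings() = default;
};

// Hypothetical derived class that records when its destructor runs.
struct DerivedSettings : BaseSettings {
  explicit DerivedSettings(bool* destroyed) : destroyed_(destroyed) {}
  ~DerivedSettings() override { *destroyed_ = true; }
  bool* destroyed_;
};
```

Without the virtual destructor in the base, deleting a DerivedSettings
through a BaseSettings pointer is undefined behavior in C++.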
Change-Id: I1f974b986193c60128009a768ec6b01b9deeacd5
[ROCm/clr commit: 77268b2e60]
Move declaration of device::Info::targetId_ to be adjacent to the
other char* name fields.
Change-Id: Iefb249e801765a87b243a2a5e6997e78e817be2b
[ROCm/clr commit: a8d7e59dff]
When TargetID is supported, the isa name will contain ':' characters
that are not legal in Windows file names. So replace every character
that is not alphanumeric, '+' or '-' with '_' to ensure the file name
will be legal on any file system.
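A minimal sketch of the replacement rule. The function name
sanitizeForFileName is hypothetical, and the input in the usage note is
only an illustrative TargetID-style string:

```cpp
#include <cctype>
#include <string>

// Replaces every character that is not alphanumeric, '+' or '-' with '_'.
inline std::string sanitizeForFileName(std::string name) {
  for (char& c : name) {
    // Cast first: passing a negative char to std::isalnum is undefined.
    const unsigned char uc = static_cast<unsigned char>(c);
    if (!std::isalnum(uc) && c != '+' && c != '-') {
      c = '_';
    }
  }
  return name;
}
```

For example, "gfx906:sramecc+:xnack-" becomes "gfx906_sramecc+_xnack-".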
Change-Id: I0b73a6188c186f75f1d2e8af19ade87667cbfe0b
[ROCm/clr commit: ed6d54b416]
HMM with xnack enabled should update page tables automatically,
but currently it does not. For now, the runtime will
force a page table update on all devices unconditionally.
Change-Id: Idfa6e1c145e6c114856214dce042b8a8349e5c58
[ROCm/clr commit: 7d3aaa7a39]