- Correct GSL path to report targets using the TargetID syntax.
- Correct GSL path to check compatibility of code objects when
loading.
- Add the concept of a device ISA and create a registry used by ROCm,
PAL and GSL.
- Support XNACK and SRAMECC target features consistently for PAL and ROCm.
- Correct logic for NullDevices and asserts to avoid memory corruption.
- Allow all NullDevices to be created for HIP.
- Numerous other code improvements.
Change-Id: I40abf3d2b22249c1492d1af5919665f8184f4e0e
- Add assertions to enforce that objects are of the correct kind and
have been allocated.
- Make destructors check if objects have been allocated before
deleting.
- Operations that require a non-NullDevice return failure if given a
NullDevice.
- Use static_cast rather than reinterpret_cast when casting from a
base class to a derived class.
Change-Id: I02ee0ea9d7982fd7ca29d49c9b02cfae111b7127
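The hardening steps in the commit above (kind assertions, failure on a NullDevice, static_cast for base-to-derived casts) can be sketched as follows. This is a minimal illustration, not the runtime's code: the class names, DeviceKind, asRealDevice and submitWork are all hypothetical stand-ins.

```cpp
#include <cassert>

// Hypothetical stand-ins for the runtime's device hierarchy.
enum class DeviceKind { Null, Real };

class Device {
 public:
  explicit Device(DeviceKind kind) : kind_(kind) {}
  virtual ~Device() = default;
  DeviceKind kind() const { return kind_; }

 private:
  DeviceKind kind_;
};

class NullDevice : public Device {
 public:
  NullDevice() : Device(DeviceKind::Null) {}
};

class RealDevice : public Device {
 public:
  RealDevice() : Device(DeviceKind::Real) {}
  void allocate() { allocated_ = true; }
  bool allocated() const { return allocated_; }

 private:
  bool allocated_ = false;
};

// Downcast with static_cast, guarded by an assertion on the object's kind.
// static_cast adjusts the pointer correctly for the class layout, which
// reinterpret_cast does not guarantee.
inline RealDevice& asRealDevice(Device& dev) {
  assert(dev.kind() == DeviceKind::Real && "expected a non-NullDevice");
  return static_cast<RealDevice&>(dev);
}

// An operation that needs a real device returns failure for a NullDevice
// instead of touching state that was never allocated.
inline bool submitWork(Device& dev) {
  if (dev.kind() == DeviceKind::Null) return false;
  return asRealDevice(dev).allocated();
}
```

The kind check runs before the cast, so a NullDevice never reaches code that assumes allocated state.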
- Correct spelling mistakes or wording in comments.
- Add missing line separators.
- Add missing comments for namespace closing brace.
Change-Id: If09cdd38aa088b0f68f750dfdef81351eb8c4935
[PAL to KFD/ROCr][ROCr_Runtime][Vega10] OCLSeparateCompile subtest of
oclcompiler from ocltst test package is encountering clLinkProgram()
failed (chksum 0x00000001) error
If the runtime does not provide a dump file name to the ELF library,
the ELF library uses a temp file in the current folder.
The current folder may not be writable for several reasons:
1. The application's current folder might be a system folder for which
the user does not have write permission.
2. The current folder is on a read-only file system. This happens for
embedded customers.
Tested on Vega10. The issue is fixed.
Change-Id: Ic0e9f040b7c7583914301673cce237ab28b0c0cb
Replace amd::Atomic with std::atomic. Remove make_atomic uses by
converting the variables to std::atomic and making sure the memory
order is relaxed when a synchronizes-with relationship is not needed.
Delete utils/atomic.hpp.
Change-Id: I0b36db8d604a8510ac6e36b32885fd16a1b8ccfa
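As a rough illustration of the memory-order choice described in the commit above: a plain counter needs no ordering with other memory, so relaxed operations suffice, while a flag that publishes data to another thread needs release/acquire. CommandCounter and Mailbox are hypothetical names, not types from the runtime.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical counter: only the value itself matters, so no
// synchronizes-with relationship is required and relaxed order is enough.
class CommandCounter {
 public:
  std::uint64_t increment() {
    return count_.fetch_add(1, std::memory_order_relaxed) + 1;
  }
  std::uint64_t value() const {
    return count_.load(std::memory_order_relaxed);
  }

 private:
  std::atomic<std::uint64_t> count_{0};
};

// Hypothetical mailbox: the flag publishes `payload` to a reader thread,
// so the store must be a release and the load an acquire.
struct Mailbox {
  int payload = 0;
  std::atomic<bool> ready{false};

  void publish(int value) {
    payload = value;
    ready.store(true, std::memory_order_release);
  }

  bool tryRead(int& out) const {
    if (!ready.load(std::memory_order_acquire)) return false;
    out = payload;
    return true;
  }
};
```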
Set the highest init_priority on the affected global variables so that
they are constructed first and destroyed last.
Change-Id: Ied59fbecab66ba8195c4a7a02b6bef9fa2fad3af
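A small sketch of how init_priority reorders global construction, assuming GCC or Clang (the attribute is a compiler extension). Tracked, constructionLog, g_late and g_early are hypothetical names used only for illustration.

```cpp
#include <string>
#include <vector>

// Function-local static, so the log exists before any global that uses it.
inline std::vector<std::string>& constructionLog() {
  static std::vector<std::string> log;
  return log;
}

struct Tracked {
  explicit Tracked(const char* name) { constructionLog().push_back(name); }
};

// Defined first, but with default priority, so it is constructed after
// every object that carries an explicit init_priority.
Tracked g_late = Tracked("late");

// 101 is the highest priority available to user code (lower values are
// reserved for the implementation). Objects are destroyed in reverse order
// of construction, so this one is also destroyed last.
Tracked g_early __attribute__((init_priority(101))) = Tracked("early");
```

The attribute applies only to namespace-scope objects of class type.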
TF no longer reserves all available memory. Any client that wants to
reserve memory can explicitly set the HIP_HIDDEN_FREE_MEM env var.
Change-Id: Ied3a948b79f49aa7327f6a820e9789e39cec143b
This workaround avoids the performance penalty of the SDMA engine
taking a while to clock up from a lower DPM state. Add the env var
GPU_FORCE_BLIT_COPY_SIZE (in KB; 1024 by default for HIP). Forcing the
Src and Dst agents to be amdgpu makes ROCr take the blit copy path for
what would otherwise have been an SDMA copy.
Change-Id: I222f687155f86000d17d66d25182e490b6710463
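One way to picture the size threshold from the commit above is the sketch below. It assumes that copies at or below the threshold take the blit path and larger ones use SDMA; the commit message does not spell out the comparison, and the helper names (parseThresholdKB, blitThresholdKB, chooseCopyPath) are hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>

// Parse a threshold in KB from a raw env string. Falls back to the default
// when the variable is unset, empty, or not a plain decimal number.
inline std::size_t parseThresholdKB(const char* raw, std::size_t defaultKB) {
  if (raw == nullptr || *raw == '\0') return defaultKB;
  char* end = nullptr;
  unsigned long long value = std::strtoull(raw, &end, 10);
  return (*end == '\0') ? static_cast<std::size_t>(value) : defaultKB;
}

// Read GPU_FORCE_BLIT_COPY_SIZE, using the HIP default of 1024 KB.
inline std::size_t blitThresholdKB() {
  return parseThresholdKB(std::getenv("GPU_FORCE_BLIT_COPY_SIZE"), 1024);
}

enum class CopyPath { Blit, Sdma };

// Assumption: small copies take the blit path to avoid waiting for the SDMA
// engine to clock up; larger copies keep using SDMA.
inline CopyPath chooseCopyPath(std::size_t bytes, std::size_t thresholdKB) {
  return (bytes <= thresholdKB * 1024) ? CopyPath::Blit : CopyPath::Sdma;
}
```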
Don't error when querying the number of devices if there are no devices present in the system.
We should just return 0 for the number of devices in this case and let the application handle this situation.
Change-Id: I20614ade5e649f3ce9ddd970d4b38bfe296f6cdb
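The behavior described in the commit above, sketched under assumptions: an empty device list is reported as a count of 0 with a success status, and only a bad argument is an error. Status, GpuInfo and getDeviceCount are hypothetical names, not the runtime's API.

```cpp
#include <vector>

// Hypothetical status codes and device record.
enum class Status { Success, InvalidValue };
struct GpuInfo {};

// Report how many devices are present. Zero devices is a valid answer,
// so the caller decides how to handle an empty system.
inline Status getDeviceCount(const std::vector<GpuInfo>& devices, int* count) {
  if (count == nullptr) return Status::InvalidValue;
  *count = static_cast<int>(devices.size());
  return Status::Success;
}
```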
~45% to 50% performance drop on the rocBLAS_int8 test
Add support for active waits without blocking the host thread.
Change-Id: Ie7bb48dcafcb4c93d448bf74749b829b626c3578