This change resolves some of the warnings generated during clr builds.
Quiet regular output of doxygen.
Disable non-documented warnings of doxygen.
Signed-off-by: Sebastian Luzynski <Sebastian.Luzynski@amd.com>
* SWDEV-558836, SWDEV-558837 - Add hipMemSetMemPool and hipMemGetMemPool implementation
* Add managed allocation type for mem pools
* Update rocprofiler-sdk with APis declaration
Updated to convert flags correctly
Added ObjectRegistry to track registered and mapped resources and incorporated it into hip_gl.
Added mip level check
Made functions static in-line
Reworked validation to be more clear.
* SWDEV-549518 - Enable logging dynamically through HIP APIS.
* SWDEV-549518 - Adding ROCProfiler related new API changes.
* rocprofiler-sdk changes for hip api additions.
---------
Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com>
Co-authored-by: jainprad <92369414+jainprad@users.noreply.github.com>
* Add hipDeviceAttributeExpertSchedMode
---------
Co-authored-by: Stefan Sokolovic <stefan.sokolovic2@amd.com>
* Update hipDeviceAttributeExpertSchedMode unit test
* Move check to ROCr from thunk interface
* Revert unrelated whitespace changes
* Revert version bump
---------
Co-authored-by: Stefan Sokolovic <stefan.sokolovic2@amd.com>
* Update what is hip
* Update HIP runtime page
* Update images
* Remove omnitrace
* Quick fix
* Feedback fixes
* Minor fixes
* Update SAXPY tutorial
Signed-off-by: Jan Stephan <jan.stephan@amd.com>
---------
Signed-off-by: Jan Stephan <jan.stephan@amd.com>
Co-authored-by: Adel Johar <adel.johar@amd.com>
Co-authored-by: Jan Stephan <jan.stephan@amd.com>
* Add HasExpertSchedMode device prop
* Add unit tests for HasExpertSchedMode
* Add gfx12 check for HasExpertSchedMode prop
* Update gfx major version check and test for ExpertSchedMode
* Minor fix and ROCr version bump
* Update projects/rocr-runtime/runtime/hsa-runtime/inc/hsa_ext_amd.h
* Update projects/rocr-runtime/runtime/hsa-runtime/inc/hsa_ext_amd.h
* Apply suggestion from @dayatsin-amd
* Apply suggestion from @dayatsin-amd
---------
Co-authored-by: Stefan Sokolovic <stefan.sokolovic2@amd.com>
Co-authored-by: David Yat Sin <77975354+dayatsin-amd@users.noreply.github.com>
* Update README documentation links for clarity and consistency across projects
- Changed links in the README files for `clr`, `hipother`, and `hip-tests` to use relative paths instead of absolute URLs, improving navigation within the repository.
* Update CONTRIBUTING documentation to use relative links for improved navigation
- Changed absolute URLs to relative paths in the CONTRIBUTING.md files for the hip and hipother projects, enhancing consistency and ease of access within the repository.
* [hip] Docs: Overhaul HW implementation page
* Update hardware implementation and glossary
* Update programming model
* Add performance optimization
* Split into how-to and understanding
---------
Signed-off-by: Jan Stephan <jan.stephan@amd.com>
Co-authored-by: Jan Stephan <jan.stephan@amd.com>
Co-authored-by: Julia Jiang <julia.jiang@amd.com>
* SWDEV-533237 Add initial support for hipOccupancyAvailableDynamicSMemPerBlock API
* SWDEV-533237 Add hipOccupancyAvailableDynamicSMemPerBlock wrapper for nvidia
* SWDEV-533237 Add implementation of hipOccupancyAvailableDynamicSMemPerBlock API
* SWDEV-533237 Add LDSAlignment field in Isa table
---------
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
* Add examples to tools folder
* Correct P2P memory access section
* Sync poriting guide
* Add HIP Graph tutorial
* Add hint about using amdgpu-dkms for IPC API
* Add a few more env variables
* SWDEV-546311 - implement hipKernelGetLibrary & hipLibraryEnumerateKernels API
* Fix for LibraryEnumerateKernel and KernelGetName
* Update Enumerate Kernels to handle 0 numKernels
* Minor fixes to function names
* fix error checking in internal function
* Update changelog for new apis
---------
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
1. Create a set of mini numa interface.
In Linux, the interface is based on system call rather than libnuma.
In Windows, the interface can also work, but the policy class is dummy.
Different from Linux, Windows doesn't provide numactl tool or numa lib to setup numa policy, thus
the default policy is followed in Windows, that is, using the closest host numa node to allocate
pinned host memory in hipHostMalloc().
To get the closest host numa node of a GPU device, you need query the new attribute
hipDeviceAttributeHostNumaId. Then you can create a thread with CPU affinity on the numa node.
For example, reference the test in hip-tests/catch/perftests/memory/hipPerfHostNumaAllocWin.cc.
2. Remove pfnSetThreadGroupAffinity and pfnGetNumaNodeProcessorMaskEx as the functions have been exposed since Win7 and Win server 2008.
3. Other minor fixes.
* SWDEV-554608 - Add hipHostRegisterIoMemory for hipHostRegister
* SWDEV-554608 - Add hipHostRegisterIoMemory for hipHostRegister
* SWDEV-554174 Added hipHostRegisterIoMemory flag in test cases
* SWDEV-554174 : Did formatting corrections
* SWDEV-554608 - set HSA_AMD_MEMORY_POOL_UNCACHED_FLAG if IoMemory is set
* SWDEV-554608 - set HSA_AMD_MEMORY_POOL_UNCACHED_FLAG if IoMemory is set
* SWDEV-554608 - Add hipHostRegisterIoMemory for hipHostRegister
---------
Co-authored-by: Anavena Venkatesh <Anavena.Venkatesh@amd.com>
Co-authored-by: Rambabu Swargam <rambabu.swargam@amd.com>
* SWDEV-545950 - Add hipStreamCopyAttributes API Implementation
* Add unit test for hipStreamCopyAttributes API
* Add ChangeLog and nvidia mapping for the API
* Update rocprofiler-sdk with new HIP API details
* [rocprofiler-sdk] handle hipStreamCopyAttributes in stream tracing service
- this new HIP function has multiple stream arguments and needs to be skipped because it does not have an explicit create/destroy/set functionality
* Update HIP_RUNTIME_API_TABLE_STEP_VERSION in clr and rocprofiler-sdk
* Resolve merge conflicts
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Return error when ext_fine_grain_pool is unavailable for
hipHostMallocUncached, hipHostAllocUncached and
hipExtHostRegisterUncached.
Disable related tests on Navi4x where
ext_fine_grain_pool is unavailable