SWDEV-502480 - Update documentation from GitHub 2024-12-05
Change-Id: I179814351b77935aff55e8ae47dd322a3e15a868
[ROCm/hip commit: f39c7a3150]
@@ -41,7 +41,7 @@ the host or parallel to the devices.

 For parallel workloads, when threads belonging to the same block need to
 synchronize to share data, use :cpp:func:`__syncthreads()` (see:
-:ref:`synchronization functions`) within the same kernel invocation. For threads
+:ref:`synchronization_functions`) within the same kernel invocation. For threads
 belonging to different blocks, use global memory with two separate
 kernel invocations. It is recommended to avoid the latter approach as it adds
 overhead.
@@ -151,7 +151,7 @@ and is generally reduced when addresses are more scattered, especially in
 global memory.

 Device memory is accessed via 32-, 64-, or 128-byte transactions that must be
-naturally aligned.
+naturally aligned.
 Maximizing memory throughput involves:

 - Coalescing memory accesses of threads within a warp into minimal transactions.
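To illustrate the coalescing guideline in the hunk above, the following host-only C++ sketch counts how many naturally aligned segments a warp touches for a given access stride. It is a simplified model and not part of the HIP API: the function name `count_segments`, the 32-thread warp size, and the 128-byte segment size are assumptions chosen for illustration, and real hardware may split or merge transactions differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical helper (not a HIP function): counts the distinct naturally
// aligned segments of `segment_bytes` touched when each of `warp_size`
// threads accesses `elem_bytes` bytes at address base + tid * stride_bytes.
std::size_t count_segments(std::uint64_t base, std::size_t stride_bytes,
                           std::size_t elem_bytes, std::size_t warp_size,
                           std::size_t segment_bytes) {
    std::set<std::uint64_t> segments;
    for (std::size_t tid = 0; tid < warp_size; ++tid) {
        const std::uint64_t first = base + tid * stride_bytes;
        const std::uint64_t last = first + elem_bytes - 1;
        // An access that straddles a segment boundary touches every
        // segment between its first and last byte.
        for (std::uint64_t s = first / segment_bytes;
             s <= last / segment_bytes; ++s) {
            segments.insert(s);
        }
    }
    return segments.size();
}
```

Under these assumptions, 32 threads reading consecutive 4-byte elements from an aligned base touch a single 128-byte segment, while a 128-byte stride touches 32 separate segments. Shifting the base by 4 bytes makes the contiguous case touch two segments, which is why alignment matters as well as contiguity.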
@@ -294,7 +294,7 @@ Applications frequently allocating and freeing memory might experience slower
 allocation calls over time as memory is released back to the operating system.
 To optimize performance in such scenarios, follow these guidelines:

-- Avoid allocating all available memory with :cpp:func:`hipMalloc` or
+- Avoid allocating all available memory with :cpp:func:`hipMalloc` or
 :cpp:func:`hipHostMalloc`, as this immediately reserves memory and might
 prevent other applications from using it. This behavior could strain the
 operating system schedulers or prevent other applications from running on the
@@ -309,7 +309,7 @@ To optimize performance in such scenarios, follow these guidelines:
 performance, they allow the application to continue running.
 - For supported platforms, use :cpp:func:`hipMallocManaged`, as it allows
 oversubscription. With the right policies, :cpp:func:`hipMallocManaged` can
-maintain most, if not all, :cpp:func:`hipMalloc` performance.
+maintain most, if not all, :cpp:func:`hipMalloc` performance.
 :cpp:func:`hipMallocManaged` doesn't require an allocation to be resident
 until it is needed or prefetched, which eases the load on the operating
 system's schedulers and facilitates multitenant scenarios.