From 77ed884c6f3cc87ea9ffda8fc232dede9ef996ca Mon Sep 17 00:00:00 2001
From: Julia Jiang <julia.jiang@amd.com>
Date: Fri, 20 Oct 2023 16:38:19 -0400
Subject: [PATCH] SWDEV-422771 - Update documents for error code and complex
 operations

Change-Id: I8f555b8e633e12593a3b8e8d58f6b950a3afc117


[ROCm/hip commit: 481fc5fcec1bb2a49b17a1b96454bc8b019329e0]
---
 projects/hip/docs/user_guide/faq.md | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/projects/hip/docs/user_guide/faq.md b/projects/hip/docs/user_guide/faq.md
index 378ad1ac7b..ca6634d3a5 100644
--- a/projects/hip/docs/user_guide/faq.md
+++ b/projects/hip/docs/user_guide/faq.md
@@ -26,7 +26,7 @@ At a high-level, the following features are not supported:
 - CUDA array, mipmappedArray and pitched memory
 - Queue priority controls
 
-See the [API Support Table](CUDA_Runtime_API_functions_supported_by_HIP.md) for more detailed information.
+See the [API Support Table](https://github.com/ROCm-Developer-Tools/HIPIFY/blob/amd-staging/docs/tables/CUDA_Runtime_API_functions_supported_by_HIP.md) for more detailed information.
 
 ### Kernel language features
 - C++-style device-side dynamic memory allocations (free, new, delete) (CUDA 4.0)
@@ -51,8 +51,6 @@ However, we can provide a rough summary of the features included in each CUDA SD
 - CUDA 5.0 :
     - Dynamic Parallelism (not supported)
     - cuIpc functions (under development).
-- CUDA 5.5 :
-    - CUPTI (not directly supported, [AMD GPUPerfAPI](http://developer.amd.com/tools-and-sdks/graphics-development/gpuperfapi/) can be used as an alternative in some cases)
 - CUDA 6.0 :
     - Managed memory (under development)
 - CUDA 6.5 :
@@ -262,7 +260,7 @@ Previously, it was essential to declare dynamic shared memory using the HIP_DYNA
 Now, the HIP-Clang compiler provides support for extern shared declarations, and the HIP_DYNAMIC_SHARED option is no longer required. You may use the standard extern definition:
 extern __shared__ type var[];
 
-## I have multiple HIP enabled devices and I am getting an error message hipErrorNoBinaryForGpu Unable to find code object for all current devices?
+## I have multiple HIP enabled devices and I am getting an error code hipErrorSharedObjectInitFailed with the message "Error: shared object initialization failed"?
 
 This error message is seen due to the fact that you do not have valid code object for all of your devices.
 
@@ -273,6 +271,9 @@ If you have a precompiled application/library (like rocblas, tensorflow etc) whi
  - The application/library does not ship code object bundles for *all* of your device(s): in this case you need to recompile the application/library yourself with correct `--offload-arch`.
  - The application/library does not ship code object bundles for *some* of your device(s), for example you have a system with an APU + GPU and the library does not ship code objects for your APU. For this you can set the environment variable `HIP_VISIBLE_DEVICES` or `CUDA_VISIBLE_DEVICES` on NVdia platform, to only enable GPUs for which code object is available. This will limit the GPUs visible to your application and allow it to run.
 
+Note: In previous releases, the error code was hipErrorNoBinaryForGpu with the message "Unable to find code object for all current devices".
+The error handling behavior has changed: the HIP runtime now reports the error code hipErrorSharedObjectInitFailed with the message "Error: shared object initialization failed" on an unsupported GPU.
+
 ## How to use per-thread default stream in HIP?
 
 The per-thread default stream is an implicit stream local to both the thread and the current device. It does not do any implicit synchronization with other streams (like explicitly created streams), or default per-thread stream on other threads.
@@ -285,6 +286,18 @@ Once source is compiled with per-thread default stream enabled, all APIs will be
 
 Besides, per-thread default stream be enabled per translation unit, users can compile some files with feature enabled and some with feature disabled. Feature enabled translation unit will have default stream as per thread and there will not be any implicit synchronization done but other modules will have legacy default stream which will do implicit synchronization.
 
+## How to use complex multiplication and division operations?
+
+In HIP, hipFloatComplex and hipDoubleComplex are defined as complex data types:
+typedef float2 hipFloatComplex;
+typedef double2 hipDoubleComplex;
+
+Any application that uses complex multiplication and division operations needs to replace the '*' and '/' operators with the following functions:
+- hipCmulf() and hipCdivf() for hipFloatComplex
+- hipCmul() and hipCdiv() for hipDoubleComplex
+
+Note: These complex operations are equivalent to the corresponding types and functions on the NVIDIA platform.
+
 ## Can I develop applications with HIP APIs on Windows the same on Linux?
 
 Yes, HIP APIs are available to use on both Linux and Windows.