From 024c7d51ea41660ca0f647323c14ae2eb67bb8c2 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu" <yaxun.liu@amd.com>
Date: Mon, 6 Jan 2020 02:02:38 -0500
Subject: [PATCH] Document FMA settings (#1717)

[ROCm/clr commit: 7dcd5f63290ad792f46cf47c3c41762554b859a2]
---
 .../clr/hipamd/docs/markdown/hip_programming_guide.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/projects/clr/hipamd/docs/markdown/hip_programming_guide.md b/projects/clr/hipamd/docs/markdown/hip_programming_guide.md
index 4a4300e357..b87d99e7f2 100644
--- a/projects/clr/hipamd/docs/markdown/hip_programming_guide.md
+++ b/projects/clr/hipamd/docs/markdown/hip_programming_guide.md
@@ -115,4 +115,15 @@ allocated.
 
 In HCC and HIP-Clang, long double type is 80-bit extended precision format for x86_64, which is not supported by AMDGPU. HCC and HIP-Clang treat long double type as IEEE double type for AMDGPU. Using long double type in HIP source code will not cause issue as long as data of long double type is not transferred between host and device. However, long double type should not be used as kernel argument type.
 
+## FMA and contractions
+
+By default, HIP-Clang assumes -ffp-contract=fast and HCC assumes -ffp-contract=off.
+For x86_64, FMA is off by default because the generic x86_64 target does not
+support FMA. To turn on FMA on x86_64, use either -mfma or -march=native
+on CPUs supporting FMA.
+
+When contractions are enabled and FMA instructions are not enabled for the CPU,
+the GPU can produce different numerical results than the CPU for expressions that
+can be contracted. A tolerance should be used for floating-point comparisons.
+
 ## [Supported Clang Options](clang_options.md)