Commit graph

4391 commits

Author SHA1 Message Date
Rahul Garg edc97f3073 Add hipDrvOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags] (#1854)
Equivalent to cuOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags].
2020-02-28 16:46:55 +05:30
jiabaxie af90312867 Cleaned up error messages for HipEnvVarDriver test (#1825)
Several error messages appeared even when the hipEnvVarDriver.exe test passed and executed successfully. They are now cleaned up. The affected cases are:

* When popen searches for the directed_test directory but does not find it, it outputs an error, then finds hipEnvVar at the same level. With the fix, the test outputs an error only if both searches for hipEnvVar fail.
* When assertions are used in the latter half of the test, conditions are set specifically to hide the devices, resulting in "No Hip Device detected" errors. The fix suppresses these errors, since the test intends not to find any devices there. The assertions themselves are untouched.

HipEnvVarDriver.cpp has also been refactored. Reading HipEnvVar now happens in a helper function used by getDeviceNumber and getDevicePCIBusNumRemote, as the code to read HipEnvVar was very similar in both.
2020-02-28 16:46:12 +05:30
Alex Voicu d830dad3be Address post-staging issues in #1809 (#1894)
Fixes SWDEV-223910 and SWDEV-223663
2020-02-27 16:21:12 +05:30
Maneesh Gupta 71e1f87f7e bump version to 3.2 (#1898)
- Bump version to 3.2
- [ci] Enable tests on ROCm 3.1
2020-02-27 16:18:31 +05:30
Nick Curtis b7dd073d93 fix long shuffle implementations for windows (#1895)
Fixes for SWDEV-223694
2020-02-26 15:53:56 +05:30
Yaxun (Sam) Liu 69404d8e78 Fix hipcc for extra -mllvm option (#1885) 2020-02-26 15:53:43 +05:30
Sarbojit2019 c1a70707e0 [HIPIFY] Add back missing execute permission to hipify-perl (#1881)
The hipify-perl script lost its executable permission, so "samples/0_Intro/square" was failing. Fixes SWDEV 223433.
2020-02-19 13:48:20 +05:30
eshcherb 82ec3c1c5b adding hipExtModuleLaunchKernel to tracing layer (#1880) 2020-02-19 13:47:49 +05:30
Alex Voicu 9b4f39e1d8 Tweak synchronous memcpy implementation (#1809)
The existing one can have issues on certain systems, so this change limits the use of direct memcpy via largeBAR to sizes where it is unequivocally better.

Also addresses SWDEV-220030 and SWDEV-222237.
2020-02-18 20:50:27 +05:30
Yaxun (Sam) Liu 92cc29ae2b Let HIP-Clang inline all functions by default (#1875)
This is a quick workaround to match HCC behavior for performance, since inlining usually
results in more optimization opportunities and therefore better performance.

We will fine-tune the inline threshold later.
2020-02-17 22:49:26 +05:30
Rahul Garg 8c5e5e435b Fix hipMemcpy3D (#1798)
Fixes #1790 and #1791. hipMemcpy3D still requires further refactoring for different input and output combinations.
2020-02-17 19:35:35 +05:30
Maneesh Gupta 854afef281 [dtests] Fix random timeout failures in hipModuleLoadDataMultThreaded (#1877)
Limit the max threads that are launched to 16.
2020-02-17 11:16:20 +05:30
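
The thread cap described in the commit above can be sketched as a simple clamp. This is only an illustration: `kMaxThreads` and `cappedThreadCount` are hypothetical names, and the actual test code may choose its thread count differently.

```cpp
#include <algorithm>

// Hypothetical cap, mirroring the "16" mentioned in the commit message.
constexpr unsigned kMaxThreads = 16;

// Clamp a requested thread count to the range [1, kMaxThreads].
// The lower bound guards against a request of 0 (for example, when a
// hardware-concurrency hint is unavailable).
unsigned cappedThreadCount(unsigned requested) {
    return std::max(1u, std::min(requested, kMaxThreads));
}
```

A caller would pass its preferred count (such as a hardware-concurrency hint) through `cappedThreadCount` before launching threads, so a machine with many cores cannot oversubscribe the test.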
vsytch 56b8b0d80e Add missing __hip_pinned_shadow__ attributes to the texture global vars. (#1866) 2020-02-15 09:52:25 +05:30
Maneesh Gupta e7120dd876 Use deque instead of vector for code readers so that the iterators and references will be stable (#1851)
* Use deque instead of vector for code readers so that the iterators and references will be stable

* Fix compile error

* Assign the iterator

* Add multithreaded test

* Make threads a multiple of hardware concurrency

* Output on failure

* Add setDevice to try and initialize the context on cuda

* Create context for cuda

* Set context on each thread

* Reduce threads on cuda

* Skip test on cuda

* Try to initialize the primary context on cuda

* Push ctx to the stack as current

* Revert "Push ctx to the stack as current"

This reverts commit bff8cbe950.

* Revert "Try to initialize the primary context on cuda"

This reverts commit fd98514113.

* updated test for nvidia path

* Add c++11 option for nvcc

Co-authored-by: satyanveshd <53337087+satyanveshd@users.noreply.github.com>
2020-02-15 09:51:24 +05:30
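
The container choice in the commit above can be illustrated with a small host-side sketch. In standard C++, appending to a `std::deque` keeps references and pointers to existing elements valid (iterators are still invalidated), whereas a `std::vector` may reallocate and invalidate all of them. The `Reader` type and the function name below are hypothetical stand-ins, not names from the HIP sources.

```cpp
#include <deque>
#include <string>

// Hypothetical stand-in for a "code reader" entry.
struct Reader { std::string name; };

// Take a pointer to the first element, grow the container at the back,
// and check that the pointer still refers to the same live element.
bool dequeKeepsElementAddresses() {
    std::deque<Reader> readers;
    readers.push_back(Reader{"reader-0"});
    const Reader* first = &readers.front();
    for (int i = 1; i < 1000; ++i) {
        readers.push_back(Reader{"reader-" + std::to_string(i)});
    }
    // push_back on a deque never relocates existing elements.
    return first == &readers.front() && first->name == "reader-0";
}
```

With `std::vector` in place of `std::deque`, the same pattern would be unsafe, because any `push_back` that exceeds the capacity moves every element to new storage.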
Nick Curtis 797a929a65 Implement long / long long shuffles (#1829)
Implement additional data-types for shuffles (long and long long).
Based upon the double implementation.
2020-02-15 09:51:09 +05:30
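
One common way to shuffle a 64-bit value is to move it as two 32-bit halves. The sketch below simulates that idea on the host, assuming the double implementation referred to above works this way (the commit message does not spell it out). `kWarpSize`, `shuffle32`, and `shuffle64` are hypothetical names for illustration, not HIP APIs.

```cpp
#include <array>
#include <cassert>
#include <cstring>

constexpr int kWarpSize = 4;  // tiny hypothetical warp, for illustration only
static_assert(sizeof(long long) == 2 * sizeof(int),
              "sketch assumes 64-bit long long and 32-bit int");

// Host-side stand-in for a 32-bit shuffle: return the value held by srcLane.
int shuffle32(const std::array<int, kWarpSize>& lanes, int srcLane) {
    assert(srcLane >= 0 && srcLane < kWarpSize);
    return lanes[srcLane];
}

// 64-bit shuffle built from two 32-bit shuffles.
long long shuffle64(const std::array<long long, kWarpSize>& lanes, int srcLane) {
    std::array<int, kWarpSize> first{}, second{};
    for (int i = 0; i < kWarpSize; ++i) {
        int halves[2];
        std::memcpy(halves, &lanes[i], sizeof(halves));  // split into two words
        first[i] = halves[0];
        second[i] = halves[1];
    }
    const int moved[2] = {shuffle32(first, srcLane), shuffle32(second, srcLane)};
    long long result;
    std::memcpy(&result, moved, sizeof(result));  // reassemble in the same order
    return result;
}
```

Using `std::memcpy` for the split and reassembly keeps the sketch independent of byte order and avoids type-punning through pointer casts.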
Siu Chi Chan f2ab87d872 Disabling HCC code object v3 generation by default.
Some PyTorch unit tests have regressed. Disabling cov3 to allow more
time to debug and unblock PyTorch.

Change-Id: Iba7f425ef3499c20c42ec45d9152b5d27ce97d03
2020-02-14 19:39:27 -05:00
Evgeny Mankov 9a9319c8e7 Merge pull request #1870 from emankov/HIP
[HIPIFY][doc] Update README.md: LLVM 10.0.0-rc2 - the latest supported LLVM Release
2020-02-14 13:25:43 +03:00
Evgeny Mankov 115f45d116 [HIPIFY][doc] Update README.md: LLVM 10.0.0-rc2 - the latest supported LLVM Release 2020-02-14 13:09:31 +03:00
Rahul Garg 9d97f91fbb [sample] Add hipDispatchEnqueueRateMT (#1869)
* [sample] Add hipDispatchEnqueueRateMT
2020-02-13 23:21:40 -08:00
Evgeny Mankov 5f2438a6c6 Merge pull request #1867 from emankov/HIP
[HIPIFY][doc] Update README.md: Windows tested configurations
2020-02-13 18:48:38 +03:00
Evgeny Mankov 084b2fa0f6 [HIPIFY][doc] Update README.md: Windows tested configurations 2020-02-13 18:34:10 +03:00
Satyanvesh Dittakavi 3fb4135946 Add c++11 option for nvcc 2020-02-13 19:48:26 +05:30
Satyanvesh Dittakavi ead254cdd5 updated test for nvidia path 2020-02-13 16:34:05 +05:30
Jeff Daily 03bb658721 missing break statement in hipDeviceGetAttribute (#1865)
The break is missing for hipDeviceAttributeMaxTexture3DDepth.
2020-02-13 14:22:56 +05:30
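
The effect of a missing `break` in a `switch` can be shown with a minimal sketch. The enum, the limits, and the case that follows the depth attribute are all hypothetical; only the general fall-through behavior is being illustrated, not the actual body of hipDeviceGetAttribute.

```cpp
// Hypothetical attribute IDs and device limits, for illustration only.
enum class Attr { MaxTexture3DHeight, MaxTexture3DDepth, TextureAlignment };

struct Limits {
    int tex3dHeight = 4096;
    int tex3dDepth = 2048;
    int texAlignment = 256;
};

// Buggy variant: the depth case has no break, so control falls through
// into the next case and the depth value is overwritten.
int queryBuggy(const Limits& l, Attr a) {
    int value = -1;
    switch (a) {
        case Attr::MaxTexture3DHeight: value = l.tex3dHeight; break;
        case Attr::MaxTexture3DDepth:  value = l.tex3dDepth;  // break missing here
        case Attr::TextureAlignment:   value = l.texAlignment; break;
    }
    return value;
}

// Fixed variant: every case ends with a break.
int queryFixed(const Limits& l, Attr a) {
    int value = -1;
    switch (a) {
        case Attr::MaxTexture3DHeight: value = l.tex3dHeight; break;
        case Attr::MaxTexture3DDepth:  value = l.tex3dDepth; break;
        case Attr::TextureAlignment:   value = l.texAlignment; break;
    }
    return value;
}
```

In the buggy variant a query for the depth silently returns the value of the following attribute, which is why this kind of bug tends to surface as a wrong result rather than a crash.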
Sarbojit2019 1109cbff83 [hip] Fix for bug introduced in #1770 when blockSize is non-power of 2 (#1864)
Fixes SWDEV-222161
2020-02-13 14:22:46 +05:30
Sarbojit2019 fc5256fd28 ihipEnablePeerAccess return error if peer is not accessible (#1858)
hipDeviceEnablePeerAccess returns success and adds the peer to the list even if it is not accessible, which creates a problem in hipMalloc when it tries to share the pointer with the peer device.
The proposed change is to check the access status before updating the peer list, and to update the list only when the peer is accessible.
2020-02-13 14:22:11 +05:30
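
The check-before-update logic described above can be sketched on the host as follows. The `Device` struct, the `Status` enum, and `enablePeerAccess` are hypothetical stand-ins; the real ihipEnablePeerAccess operates on HIP's internal device and context state.

```cpp
#include <set>

// Hypothetical status codes for this sketch.
enum class Status { Success, PeerNotAccessible };

struct Device {
    int id;
    std::set<int> reachable;  // hypothetical: device IDs this device can access
    std::set<int> peers;      // peers for which access has been enabled
};

// Add the peer to the list only after confirming it is accessible;
// otherwise report an error and leave the peer list unchanged.
Status enablePeerAccess(Device& self, int peerId) {
    if (self.reachable.count(peerId) == 0) {
        return Status::PeerNotAccessible;
    }
    self.peers.insert(peerId);
    return Status::Success;
}
```

Because the list is only updated on success, later code that walks the peer list (for example, to share an allocation) never sees a device it cannot reach.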
ansurya 8c6934223b Reduce GPU copying based on arch it runs on (#1751)
Implements SWDEV-213230.
2020-02-13 14:21:51 +05:30
Evgeny Mankov 2536a3093d Merge pull request #1830 from asalmanp/coop_flag_define
[HIP][HIPIFY] Add some missing flags for cooperative launch and occup…
2020-02-12 14:07:39 +03:00
Paul 26bb6a97a7 Revert "Try to initialize the primary context on cuda"
This reverts commit fd98514113.
2020-02-11 12:34:11 -06:00
Paul e82e3c2339 Revert "Push ctx to the stack as current"
This reverts commit bff8cbe950.
2020-02-11 12:34:10 -06:00
Paul bff8cbe950 Push ctx to the stack as current 2020-02-11 11:46:29 -06:00
Paul fd98514113 Try to initialize the primary context on cuda 2020-02-11 11:26:24 -06:00
Aryan Salmanpour 959f1b0f0e fix build error in nvcc path 2020-02-11 12:16:51 -05:00
Jatin Chaudhary ab7526f64c Revert "Sync hip-targets*.cmake in package with install changes (#1831)" (#1860)
Fixes SWDEV-222155 & SWDEV-222158
This reverts commit 6891615a15.
2020-02-11 11:56:57 +05:30
Paul b9f97ec3fe Skip test on cuda 2020-02-10 17:23:58 -06:00
Paul 30e8dfdd86 Reduce threads on cuda 2020-02-10 16:37:34 -06:00
Paul 2d9a2d866c Set context on each thread 2020-02-10 16:01:53 -06:00
Paul e5d077f70e Create context for cuda 2020-02-10 15:52:34 -06:00
Paul 29a257d79b Add setDevice to try and initialize the context on cuda 2020-02-10 13:37:45 -06:00
Aryan Salmanpour 5a29f27455 Fix a typo causing a build error 2020-02-10 11:44:40 -05:00
Aryan Salmanpour 874b201ee2 resolve merge conflict 2020-02-10 10:30:55 -05:00
Maneesh Gupta 6614ae33e0 gedit/hip.lang does not need a separate license 2020-02-10 16:27:20 +05:30
Maneesh Gupta 9acdcf27c5 Update copyright section in gedit/hip.lang 2020-02-10 16:25:38 +05:30
Maneesh Gupta f8e1c01900 Revert "Match Occupancy APIs syntax with CUDA (#1625)" (#1857)
Reverting this for now till we figure out how to avoid the build
breakage.

This reverts commit fa98798b63.
2020-02-10 10:45:28 +05:30
Alex Voicu dd34ea95d6 (Maybe) Match alignment between Clang and GCC. (#1789)
Should fix #1740 and the related internal bug.
2020-02-10 10:44:49 +05:30
mhbliao a01b262660 [hip] Cleanup compiler wrapper for HIP-Clang. (#1847) 2020-02-07 13:28:26 -08:00
Paul d77ede7015 Output on failure 2020-02-07 10:13:28 -06:00
Paul 8e494cfce8 Make threads a multiple of hardware concurrency 2020-02-06 16:23:29 -06:00
Paul 5361424702 Add multithreaded test 2020-02-06 16:21:40 -06:00
Michael LIAO 66678b0170 [hipcc] Skip warning on gfx000.
- The known-target check should skip `gfx000` as well, since it is not
  used in forming the real compilation command. This avoids generating an
  annoying warning on `gfx000`.
2020-02-06 17:09:14 -05:00