3565 Commits

Author SHA1 Message Date
Evgeny Mankov 355f49a850 [HIPIFY][tests] Add reverse engineered HIP sample "stream"
+ Add additional checks for extern __shared__ due to [#1109]


[ROCm/clr commit: 7cc12df514]
2019-05-15 20:17:03 +03:00
Evgeny Mankov bbe9275e38 [HIPIFY][fix][#1109] Do not preserve extern __shared__ for IncompleteArrayType
+ Update tests accordingly


[ROCm/clr commit: bf65120156]
2019-05-15 20:05:56 +03:00
Evgeny Mankov b866a2e1d1 Merge pull request #1104 from emankov/master
[HIPIFY][tests] Add reverse engineered HIP sample Profiler

[ROCm/clr commit: a0f10ebaa2]
2019-05-14 16:59:36 +03:00
Evgeny Mankov 7a5a838e80 [HIPIFY][tests] Add reverse engineered HIP sample Profiler
+ Add missing cuda_profiler_api.h to hip/hip_profile.h transformation.
NOTE: The HIP Profiler API is under development. This is NOT a WORKING example.
TODO: Find a way to generate HIP_SCOPED_MARKER, HIP_BEGIN_MARKER, and HIP_END_MARKER, declared in hip/hip_profile.h, at a particular place (their signatures are still to be obtained).


[ROCm/clr commit: 5e49c25faa]
2019-05-14 16:43:44 +03:00
Evgeny Mankov a973fba09f Merge pull request #1102 from emankov/master
[HIPIFY][tests] Add reverse engineered HIP sample hipEvent

[ROCm/clr commit: 5da723222e]
2019-05-13 22:14:41 +03:00
Evgeny Mankov 8c12edcf65 [HIPIFY][tests] Add reverse engineered HIP sample hipEvent
[ROCm/clr commit: 9860dac7fa]
2019-05-13 22:12:43 +03:00
Evgeny Mankov 612afc5fed Merge pull request #1101 from emankov/master
[HIPIFY][tests] Add reverse engineered HIP sample MatrixTranspose

[ROCm/clr commit: 4cf53fb09c]
2019-05-13 19:39:16 +03:00
emankov 39b28d7623 [HIPIFY][tests] Add reverse engineered HIP sample MatrixTranspose
[ROCm/clr commit: cdc76af186]
2019-05-13 19:37:18 +03:00
Wen-Heng (Jack) Chung e92ffd2261 Revert "HACK for SWDEV-173477" (#1004)
* Revert "HACK for SWDEV-173477"

This reverts commit 86379d694f.

[ROCm/clr commit: a4db991cbf]
2019-05-13 14:42:05 +05:30
Maneesh Gupta e0e30536e6 Merge pull request #1083 from gargrahul/fix_hip_impl_visible_agents
Maintain HIP_VISIBLE_DEVICES for kernel launch

[ROCm/clr commit: c9fdb42b91]
2019-05-13 14:20:18 +05:30
Rahul Garg d44e800a17 Add fine grained host memory lock support (#1095)
* Add fine grained host memory lock support

* Fix default flag check


[ROCm/clr commit: e1f3dc0c80]
2019-05-13 11:48:26 +05:30
Nick Curtis 3b6b356d23 Markdown fixes & Whitespace cleanup for samples (#1096)
* Fix multiline code blocks in README's

* Whitespace cleanup


[ROCm/clr commit: fb92feae0e]
2019-05-12 19:27:44 +05:30
Maneesh Gupta 89da742110 Merge pull request #1094 from mangupta/hit_improvements
[dtests] Add new tests to directed tests

[ROCm/clr commit: 0cc7fe8a9f]
2019-05-12 19:25:21 +05:30
Siu Chi Chan 76f535b4ce migrate program_state logic from header into shared library (phase I) (#1077)
* Revert "Revert "Use COMgr to read Kernel Args Metadata (#1006)""

This reverts commit f8d108a815.

* Revert "Use COMgr to read Kernel Args Metadata (#1006)"

This reverts commit 10048a5631.

* Revert "improve program state commentary"

This reverts commit 5233d41c6c.

* Revert "load program state once per agent"

This reverts commit 9cee2c5311.

* start moving function_names() into the hip shared lib

* start moving code_object_blobs to a new "state" object

* Consolidate various program state related static objects into a
single program_state object

* minor clean up

* move more stuff from functional_grid_launch into program_state

* debug make_kernarg

* moving lookup for kernargs size_align into program_state

* clean up old code for kernarg size and alignment

* update hip_module to use newer api in program_state

* Create public member functions for program_state

* move most program state functions into shared library

* Pass the data buffer size to load_executable
Otherwise, it can't figure out what the data size is
just from the char* (since the data is not really a string)

* turning free functions in program state into members of program_state_impl

* change the free function globals() into a member of program_state_impl

* replace the static mutex used for populating globals

* moving associate_code_object_symbols_with_host_allocation into
program_state_impl

* move load_code_object_and_freeze_executable into program_state_impl

* moving executables and functions_names into program_state_impl

* moving kernels() into program_state_impl

* moving functions() into program_state_impl

* move get_kernargs into program_state_impl

* moving kernel_descriptor into program_state_impl

* moving kernargs_size_align calculation into program_state_impl

* Changing the handle to program_state_impl to a pointer

* moving program_state_impl into a separate inline source file

* fixing/cleaning up some header file includes

* moving member function for kernargs_size_align into program_state.cpp

* moving Kernel_descriptor into program_state.inl

* add a new class to manage agent globals

* moving all agent globals processing functions into agent_globals_impl

* load program state once per agent

re-merging PR991 against other program state changes

* fix per-agent program state member initialization

* cache executables based on elf name, isa, and agent.

This avoids program state reloading executables after a shared library is dlopened.

re-merging PR1057 against other program state changes

* protect executables cache by a global mutex

* return ref to executables cache

* adapt PR#981 Make hipModuleGetGlobal be in HIP runtime


[ROCm/clr commit: 05a1b696da]
2019-05-12 19:24:03 +05:30
Maneesh Gupta 54f932e569 Merge pull request #1084 from mhbliao/hliao/master/api_ext
[hip] Add API `hipExtModuleLaunchKernel` in HIP runtime

[ROCm/clr commit: e78a09c041]
2019-05-09 18:26:31 +05:30
Maneesh Gupta 3d098c6a1c [dtests] Fix hipModule test for nvcc path
Change-Id: If918b87b848a825242e06b0d552a7be188a1c4b6


[ROCm/clr commit: 6e573ba430]
2019-05-09 18:17:19 +05:30
Maneesh Gupta 3b75006961 [dtests] Add complex_loading_behavior test
Change-Id: Iadf135cb727a1a3761abef20336d652b159c7dcd


[ROCm/clr commit: e95f7fc1f8]
2019-05-09 18:03:42 +05:30
Maneesh Gupta f46fafd3a4 [dtests] Add hipModule test to unit tests
Change-Id: I1dac38f8580265e2e9c82d88e4f070a2ff87f60b


[ROCm/clr commit: 4b38188e1e]
2019-05-09 11:36:46 +05:30
Maneesh Gupta 1b896685a3 [hit] Add support for BUILD_CMD
[ROCm/clr commit: dac20b7736]
2019-05-09 11:36:26 +05:30
Maneesh Gupta 68cf672441 [hit] Remove CUSTOM_CMD
Change-Id: Ia156fe6aab9cfcc11284823ea5131e33eaf962bc


[ROCm/clr commit: db52b0f60f]
2019-05-09 09:59:18 +05:30
Maneesh Gupta 74110bbd43 [hit] Rename RUN -> TEST & RUN_NAMED -> TEST_NAMED
Change-Id: I75e24f15129973cee15fc9dac65d678bd2172074


[ROCm/clr commit: 53dd1df3fa]
2019-05-09 09:59:18 +05:30
Evgeny Mankov 3e3140c9f7 Merge pull request #1090 from emankov/master
[HIPIFY][python] Initial support of hipify-python generation from hipify-clang

[ROCm/clr commit: b6ff82a2e6]
2019-05-08 19:12:08 +03:00
Evgeny Mankov 197affbe2a [HIPIFY][python] Initial support of hipify-python generation from hipify-clang
+ Only generation of the transformation map of CUDA entities is implemented.
+ 2 hipify-clang options are added: -python, -o-python-map-dir.
+ Explicitly set -roc option for cuda_to_hip_mappings.py generation.
+ The generated file might already be used by the pytorch team.


[ROCm/clr commit: 0f4affde9c]
2019-05-08 19:08:55 +03:00
Evgeny Mankov d1661fa10d Merge pull request #1089 from emankov/master
[HIPIFY][perl] Support of hipify-perl generation from hipify-clang: n…

[ROCm/clr commit: c7243ebdae]
2019-05-08 15:59:45 +03:00
Evgeny Mankov 51fe4163c0 [HIPIFY][perl] Support of hipify-perl generation from hipify-clang: next steps
+ Generate transformation map sorted by entity type.
+ Add a generation of supported header files.


[ROCm/clr commit: 82efa64f55]
2019-05-08 15:25:06 +03:00
Maneesh Gupta d5f6a19543 Merge pull request #1088 from ROCm-Developer-Tools/mangupta-patch-1
[ci] Enable tests on ROCm 2.4

[ROCm/clr commit: b3c159344e]
2019-05-08 12:44:02 +05:30
Maneesh Gupta 63e2d774a0 [ci] Enable tests on ROCm 2.4
[ROCm/clr commit: a78e719835]
2019-05-08 12:07:33 +05:30
Evgeny Mankov 2cdf3d7e73 Merge pull request #1085 from emankov/master
[HIPIFY][perl] Initial support of hipify-perl generation from hipify-clang

[ROCm/clr commit: df8909d73d]
2019-05-07 17:30:39 +03:00
Evgeny Mankov 413e0f97fb [HIPIFY][perl] Initial support of hipify-perl generation from hipify-clang
+ Only generation of the transformation map of CUDA entities supported by HIP is implemented.
+ 3 hipify-clang options are added: -perl, -o-perl-map, -o-perl-map-dir.
+ OptionsParser mode is changed from OneOrMore to Optional to support hipify-perl generation without actual hipification.
+ Add an explicit check for a missing source file specification when no perl generation is requested.


[ROCm/clr commit: 849155d865]
2019-05-07 17:27:34 +03:00
Maneesh Gupta 301a9292ff Merge pull request #1082 from gargrahul/fix_hipmemcpy_symbol_nvcc
Fix symbol address issue on NVCC path

[ROCm/clr commit: c6cf2a9e26]
2019-05-07 16:17:01 +05:30
Maneesh Gupta 30c7ed3e28 Merge pull request #1081 from mangupta/swdev-181624
Implement hipExtGetLinkTypeAndHopCount for ROCm devices

[ROCm/clr commit: c6c5e4cee8]
2019-05-07 16:15:41 +05:30
Maneesh Gupta 532725c9c8 Merge pull request #1075 from mhbliao/hliao/master/test_fix2
[test] Add device variant of `std::declval`.

[ROCm/clr commit: 51e158c633]
2019-05-07 16:15:01 +05:30
Maneesh Gupta dce65678d7 Merge pull request #1074 from mhbliao/hliao/master/test_fix
[test] Use explicit cast for address space cast.

[ROCm/clr commit: 7f759750d1]
2019-05-07 16:09:15 +05:30
Maneesh Gupta fb08e0f25e Merge pull request #1073 from kpyzhov/multi-thread-device-test
hipMultiThreadDevice test: Reduced maximum number of created HIP stre…

[ROCm/clr commit: d71afeccc8]
2019-05-07 16:08:37 +05:30
Maneesh Gupta e514de5b33 Merge pull request #1072 from kpyzhov/master
Refined hipSetDevice test.

[ROCm/clr commit: 8f352427f4]
2019-05-07 16:07:36 +05:30
Maneesh Gupta 4a4745e466 Merge pull request #1069 from mhbliao/hliao/master/test_cleanup
[test] Remove unused common routines.

[ROCm/clr commit: 0fffbbe67a]
2019-05-07 16:02:57 +05:30
Maneesh Gupta 46d0385435 Merge pull request #1068 from mhbliao/hliao/master/dev_vec_func
[devfunc] Add necessary `__device__` and `__host__` attributes.

[ROCm/clr commit: 11972049c6]
2019-05-07 16:01:48 +05:30
Yaxun (Sam) Liu 01ef00b568 Add documentation for supported clang options (#1065)
* Add documentation for supported clang options

* Fix typo


[ROCm/clr commit: 0b43b24d3f]
2019-05-07 15:59:40 +05:30
wkwchau 7eaaf6f1ae Return hipErrorInsufficientDriver status when CPU device not found (#1064)
* Return hipErrorInsufficientDriver status when CPU device not found - no exception thrown

* Return hipErrorInsufficientDriver status when CPU device not found


[ROCm/clr commit: ebf986dcee]
2019-05-07 15:58:25 +05:30
Maneesh Gupta c6e14467f7 Merge pull request #1061 from mhbliao/hliao/master/hipcc
[hip] Replace `--rpath` with `--rpath-link`

[ROCm/clr commit: 46ac83a429]
2019-05-07 15:57:57 +05:30
Maneesh Gupta 0364d5c710 Merge pull request #1054 from ssahasra/dry
minor cleanup: eliminate repetition

[ROCm/clr commit: 1a1feb600f]
2019-05-07 15:57:46 +05:30
Michael LIAO d94d566410 [hip] Add API hipExtModuleLaunchKernel in HIP runtime
[ROCm/clr commit: de768c22ae]
2019-05-06 21:20:28 -04:00
Rahul Garg 3f65bec096 Maintain HIP_VISIBLE_DEVICES for kernel launch
[ROCm/clr commit: 3be54a903c]
2019-05-07 05:09:02 +05:30
Rahul Garg bf3bafb9f5 Fix symbol address issue on NVCC path
[ROCm/clr commit: 6cbc70d238]
2019-05-07 03:59:43 +05:30
Maneesh Gupta f657eba4a5 Implement hipExtGetLinkTypeAndHopCount for ROCm devices
Change-Id: Ie5bb4f640ac6d189c7fceeab22627a7494fd10bd


[ROCm/clr commit: 2f43f110d9]
2019-05-06 15:54:31 +05:30
Michael LIAO f644d4daaa [test] Add device variant of std::declval.
- Current clang disallows any invocation of wrong-side functions, even
  in a context with type inspection only. Work around that by adding a
  variant of `std::declval` with the `__device__` attribute.


[ROCm/clr commit: 32f69c8bc4]
2019-05-03 15:58:31 -04:00
Michael LIAO fa74e75fc1 [test] Use explicit cast for address space cast.
[ROCm/clr commit: a27877794f]
2019-05-03 14:56:00 -04:00
Maneesh Gupta 13b13c3493 Merge pull request #1062 from mhbliao/hliao/master/icmp
[hip] Re-implement ballot using AMDGCN builtins

[ROCm/clr commit: 2eafa5dcf9]
2019-05-03 17:48:19 +05:30
Maneesh Gupta c73c864fdf Merge pull request #1058 from mhbliao/hliao/master/devfunc
[Device Function] Fix implementation

[ROCm/clr commit: ad070d4da5]
2019-05-03 17:47:51 +05:30
Konstantin Pyzhov 1be1dd207a hipMultiThreadDevice test: Reduced maximum number of created HIP streams on Windows.
[ROCm/clr commit: e04e408a37]
2019-05-03 05:43:30 -04:00