rocm-systems

Author	SHA1	Message	Date
Sean Keely	495c1da5f3	Pull from github (tstellar): Prefer using memfd_create() for the ring buffer. We were using /dev/shm, but this won't work on systems that either don't have /dev/shm or have mounted it with noexec, because for everything other than gfx700 we map the ring buffer with PROT_EXEC. memfd_create() is Linux specific and was added in Linux 3.17, so we will fallback to using /dev/shm on systems where memfd_create() is not available. Change-Id: I58fb533eebc362f6d29dc3e316a80801014d50e8 [ROCm/ROCR-Runtime commit: `b93ffafdc7`]	2017-11-28 20:47:12 -05:00
Sean Keely	37132e4a21	Improve loop variables. Derived from github pull request by folklore1984. Change-Id: I70cd3da131691543fed8bf913d6245d41c49280d [ROCm/ROCR-Runtime commit: `4b603e803d`]	2017-11-28 20:36:22 -05:00
Sean Keely	7ed62f815f	Pull from github (pmargheritta): Corrected semantics used in hsa_queue_load_write_index_relaxed. The semantics that was used in hsa_queue_load_write_index_relaxed didn't seem to match the name of the function. I also removed a useless return keyword. Change-Id: If3819d38fb367f122fc382edf8ee3771a23279ae [ROCm/ROCR-Runtime commit: `5872b618de`]	2017-11-28 20:35:50 -05:00
Oak Zeng	90f35469d6	Cosmetic changes in events.c Change-Id: Idecb8eede8811020b3af51cbc71da74849029c82 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `cce57cec26`]	2017-11-28 15:20:51 -05:00
Oak Zeng	5883a2c86b	More cleanup of fmm.c 1. Renamed _fmm_map_to_gpu to _fmm_map_to_apu_local to reflect the real semantics of this function 2. Renamed _fmm_map_to_gpu_gtt to _fmm_map_to_gpu because this function is used to map both gtt and local memory 3. Call _fmm_map_to_gpu in _fmm_map_to_apu_local to get rid of duplicated codes Change-Id: Id8e3ebfffe0a3c27ebdcac8a8f4dc3738d67d10a Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `019f7cbd20`]	2017-11-27 18:47:35 -05:00
Oak Zeng	b1a482dd52	Cleanup fmm.c 1. Initialize pointers to NULL in vm_create_and_init_object 2. Added helper function to add/remove device ids to/from mapped array 3. Only map nodes that were not mapped currently 4. Remove unnecessary condition check on object frees Change-Id: I7aed6d40c7464be0d168d5796229af55451e0f34 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `b4c89c1ea7`]	2017-11-27 18:47:23 -05:00
Amber Lin	b3439f48c2	Add debug message in PMC trace Print data in PMC trace when the debug level is set to 7(pr_debug). Change-Id: I9abbb8f6c3f7962fb637528578c1a58b7784042d Signed-off-by: Amber Lin <Amber.Lin@amd.com> [ROCm/ROCR-Runtime commit: `6f7b55f2d8`]	2017-11-22 10:09:49 -05:00
Oak Zeng	7486e8e29f	Fix unconditional unmap in fmm_map_to_gpu_nodes _fmm_unmap_from_gpu is called in fmm_map_to_gpu_nodes to unmap buffer from nodes that is already mapped to but not in the new map nodes list. Previously, the unmap was called unconditionally even though the size of the array to unmap is 0. This fixes the issue by calling the unmap func only when the unmap array size is not 0. Also releases the fmm_mutex on error returns Change-Id: Iadd8383caf7ebb92f02618798c5efd138a352aaa Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `061db45fe2`]	2017-11-21 15:16:39 -05:00
Oak Zeng	84d88b8e1a	Properly control lifecycle of ptr info objects Buffer mapping to devices and buffer registration to devices can be changed b/t two pointer info queries. Thus update buffer mapping info and registration info only when mapping and registration changed. This is done by free mapped_node_id_array on mapping to new device and free registered_node_id_array on registration and re-allocate them on next ptr info query. Also uses fmm_mutex to avoid race conditions in case of calling hsaKmtQueryPointerInfo concurrently with calling of buffer mapping or registration Change-Id: Ibc2e20be1fc0147066f873dfa44b21f5015104b7 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `f06e887725`]	2017-11-21 15:16:29 -05:00
Evgeny	acce35d21f	_aqlprofile_start() API migration Change-Id: I7c8c7a6fc6f9b20cc2e4074dde38fb19440927f1 [ROCm/ROCR-Runtime commit: `86939368d1`]	2017-11-20 17:32:19 -05:00
Sean Keely	a62c1f5364	Use a default module search path if not already specified. Change-Id: I782f0b758dc908c25abeb7f3536418cb5a48ac5e [ROCm/ROCR-Runtime commit: `e81d04e11b`]	2017-11-20 14:12:27 -05:00
Amber Lin	d2467747f1	Correct command in make package cmake command in making packages was not updated. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Change-Id: Iafd3d9f4941d782bd77cfd0efafe48a02221b002 [ROCm/ROCR-Runtime commit: `61d1c6ffac`]	2017-11-20 10:28:25 -05:00
Chris Freehill	ed0537ed67	Device ID/family corrections for gfx9xx Change-Id: Icb25fbbaeb99ce886a2852b48d02875ee0f197a2 [ROCm/ROCR-Runtime commit: `651ae1bf70`]	2017-11-16 07:27:54 -05:00
Evgeny	acaf0d0aac	aqlprofil API: removing from HSA hsa_api_trace/hsa_ext_interface Change-Id: I12fac55ea9ccfdb119899bf9d000e3c8b0bf4bbb [ROCm/ROCR-Runtime commit: `6e1b9288f6`]	2017-11-11 10:01:12 -06:00
Evgeny	fd81986bb2	aqlprofile API: _aqlprofile_start() returns required profile buffer sizes if undersized Change-Id: Ib14b2cb2e7e2026c3af0b7bd2f08f51e48e598b2 [ROCm/ROCR-Runtime commit: `bb8eaf3ac8`]	2017-11-09 20:03:55 -06:00
Oak Zeng	11f961f499	Added "-g" to CFLAGS for debug build Previously even for debug build, -O2 is used. So there wasn't debug information in the debug build. Change-Id: I6334474e007480eb2db191bdfec5a71677c26a52 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `d4e6ec0ff0`]	2017-11-09 07:43:36 -05:00
Sean Keely	76c28f38c4	Fix bad casts in tools. Also virtualize queue profiling enable. Change-Id: I761b41269be3df7eb64a5914ee9951ed6b51bb04 [ROCm/ROCR-Runtime commit: `6455a69b03`]	2017-11-08 15:50:02 -05:00
Sean Keely	0fcdd63d88	Add callback exception forwarding. Modified callbacks for intercept queue, queue error, iterate agent and iterate region. Change-Id: I8bdd67f2312510ea7eb9caec93babca244938b40 [ROCm/ROCR-Runtime commit: `a6d8a48cbc`]	2017-11-08 15:50:02 -05:00
Sean Keely	e2efba0676	Exception support for Queue. Remove "zombie" queue state and report queue creation failure via exceptions. Make Shared object a final container and support array objects with Shared. Add message printing to hsa_exception in debug builds. Change-Id: I459f38c80846018acbf45538874e95f91dd6b195 [ROCm/ROCR-Runtime commit: `f312a7386e`]	2017-11-08 15:50:02 -05:00
Sean Keely	2406218416	Add queue intercept support to the runtime. Queue intercept is exposed as two tools-only APIs via the API intercept table. Change-Id: Iac9602ed3143974d85c3569e9092295ad18037f8 [ROCm/ROCR-Runtime commit: `0c7dde2d1f`]	2017-11-08 15:50:01 -05:00
Oak Zeng	14121397cd	Correctly handle max_map_count limit after failed memory allocation Also separated a function for removing CPU mapping and reserving address, as a refactoring of codes Change-Id: I1feb85b0b2ec942487f899ec3192c7c47dd7c7d5 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `07110fbd38`]	2017-11-08 10:05:04 -05:00
Chris Freehill	62b331353a	Re-enable IPC test Fix for fixes this. Change-Id: I63a8d1a16d5029f240f075bb97ab6a1156b5cab2 [ROCm/ROCR-Runtime commit: `473be763ff`]	2017-11-08 10:02:51 -05:00
Kent Russell	c52d3b6997	Revert "aqlprofile API: _start() sets buffers sizes with NULL ptr; block counters reg number / block name info" This reverts commit `8518a48d4f`. Change-Id: Ie90b091df772bf9391494c773d63858aafbc1176 [ROCm/ROCR-Runtime commit: `b29d3f63e2`]	2017-11-08 06:59:33 -04:00
Evgeny	8518a48d4f	aqlprofile API: _start() sets buffers sizes with NULL ptr; block counters reg number / block name info Change-Id: I3cb93453b683c55bf5ec26271648232306a5d140 [ROCm/ROCR-Runtime commit: `3daa85fad8`]	2017-11-07 15:05:47 -05:00
Kent Russell	1222b7c9fb	CMakeLists: Make roct-dev dependent on roct Change-Id: Ib7d2927087dcd53da7916951de9d6a71ae6bb21b [ROCm/ROCR-Runtime commit: `c704ff60b3`]	2017-11-07 06:43:37 -05:00
Amber Lin	6ba2a9b764	Use absolute path on cmake parameter Update build instructions in README.md to use absolute path on cmake parameter, CMAKE_MODULE_PATH. Relative path causes build error. Tested on cmake 3.5.1 and cmake 3.5.2. Change-Id: I1b8e8deb9f4941580580be8087a94655ae155d02 Signed-off-by: Amber Lin <Amber.Lin@amd.com> [ROCm/ROCR-Runtime commit: `310e3d7b8b`]	2017-11-02 17:34:46 -04:00
Oak Zeng	e305dc9c82	Use drm render device to map kfd BOs Previously kfd device is used to map memory for CPU access. However this is not compatible with how TTM handles CPU mapping on eviction - memory won't be unmapped and remapped on restore. This fixes the issue by mmapping memory using DRM render device. This patch requires a coordinated kernel driver change to work. To make it compatible with old kernel driver, some temporary codes are included. Once the coordinated kernel driver is checked in, the temporary codes can be removed. Change-Id: Ie7b304c4a82b7e8d5ab703acb81d66430af4f0bc Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `68a2d286ca`]	2017-11-02 09:06:26 -04:00
Shaoyun Liu	a068301408	Add asic id for gfx906 on emulator On thunk level, gfx906 works same as gfx900 chip Change-Id: I727bd904284616f3b1b9b911e41ad0f19318b3ee Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> [ROCm/ROCR-Runtime commit: `55fc06dac3`]	2017-10-31 14:58:09 -04:00
Sean Keely	cb47089c17	Make HostQueue::queue_count_ a portable atomic type. Also make lint happy. Change-Id: I0f965df6a76fd959df9eb411d1f1b11847159790 [ROCm/ROCR-Runtime commit: `d93f92f42d`]	2017-10-31 02:38:25 -05:00
Qingchuan Shi	3e9a0561c0	Add APIs to support debugging vm fault 1. Add hsa ext api hsa_amd_register_vmfault_handler for debugger to register callback in case of VM fault. 2. Extend hsa_ven_amd_loader API to: (1) iterate loaded code objects in executable: hsa_ven_amd_loader_executable_iterate_loaded_code_objects (2) get loaded code object info: hsa_ven_amd_loader_loaded_code_object_get_info 3. Make the id of hsa_queue the same as the one used in communication with thunk (for amd_aql_queue) Change-Id: I68910809e59e24297350d262606f00e96c14bcbd [ROCm/ROCR-Runtime commit: `ce6aee01ed`]	2017-10-28 21:48:26 -04:00
Philip Yang	ac80bac82a	Fix double free on fork after hsaKmtCloseKFD Child process hsaKmtOpenKFD() call must re-initialize global variables copied from parent process. This includes close all file handles, free dynamically malloc buf. Double free issue is because destroy_device_debugging_memory() free the memory in parent process hsaKmtCloseKFD() but don't reset it to null pointer. As a result, child process free it again. kfd_fd is closed in parent process but don't reset to 0, so child process close it again. Fix: reset kfd_fd to 0 after close, reset is_device_debugged pointer to 0 after free Change-Id: I421b3decbcaa4111298b8e599aa16940d851a58c Signed-off-by: Philip Yang <Philip.Yang@amd.com> [ROCm/ROCR-Runtime commit: `3501b2f40d`]	2017-10-26 15:36:15 -04:00
Sean Keely	718134a369	Remove make build file. Change-Id: I86abae4c44b6c606fb850eff6d44cdbf30cf59f5 [ROCm/ROCR-Runtime commit: `aece2f8fc2`]	2017-10-26 01:12:31 -04:00
Jan Vesely	d28f85fb3a	cmake: make sure there are no undefined symbols Change-Id: Id5a268d7e512f71c1a65af598543eb60ae6c3596 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> [ROCm/ROCR-Runtime commit: `d8a8f88737`]	2017-10-25 17:46:58 -04:00
Jan Vesely	0dded82034	cmake: Use pkg-config to find libpci Change-Id: I1ab4397d88a2bd48ce0d6f2d3c33efcf47bc442f Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> [ROCm/ROCR-Runtime commit: `383f275aa7`]	2017-10-25 17:46:58 -04:00
Sean Keely	0a4a1a2433	Fix error message description. Change-Id: I32efed68e970a4882aca9decbbcda3fcd5c5cb43 [ROCm/ROCR-Runtime commit: `6ee2ccb08b`]	2017-10-24 21:52:21 -05:00
Sean Keely	8122fbebad	Set doorbell kind code for gfx9+ device enqueue. Change-Id: I93c4cea677ae51f97ac768614333743fb26b2f54 [ROCm/ROCR-Runtime commit: `5a4ab91be1`]	2017-10-21 11:08:44 -04:00
Sean Keely	5dfef3ef77	Improve build system handling of non-default directory layouts. Adds the thunk include and lib paths to the cache, removes paths to indicator files from the cache, uses the cached path directory (if any) as a search hint for indicator files. Change-Id: I0859faa8d229a97abfaacb408d2c831e317aed5f [ROCm/ROCR-Runtime commit: `a8d818a6bc`]	2017-10-21 11:08:15 -04:00
Sean Keely	41615ea7d5	Improve unhandled exception error reporting in debug builds. Change-Id: Ia92d1a93163105d817a2147d96f2edd399e2b70d [ROCm/ROCR-Runtime commit: `3cef9b1a04`]	2017-10-21 11:08:01 -04:00
Sean Keely	bdb5edad34	Fix memory leak in exception path. Change-Id: Iad5f035cd1909be4a8f1a1f5dd7ca5abec0694b4 [ROCm/ROCR-Runtime commit: `737966eb25`]	2017-10-21 11:08:00 -04:00
Chris Freehill	42bbc5ee85	Undo temporary namespace change Change-Id: I7f4c06f7037713db855b51367256cf4c7ba41860 [ROCm/ROCR-Runtime commit: `5a3230af66`]	2017-10-20 20:02:13 -05:00
Chris Freehill	ca41ce6730	Temporary change to namespaces to adjust for smi change Change-Id: Ic91bfb678912a82214f0a462a4b57e531f12977a [ROCm/ROCR-Runtime commit: `50a3d9402a`]	2017-10-20 13:12:06 -04:00
Ramesh Errabolu	a4d753615b	Changes to support surfacing of link weight as part of link info Change-Id: I1c0705a9374af1245f0419c51beded0d7ee10639 [ROCm/ROCR-Runtime commit: `dccbc9f2af`]	2017-10-20 12:09:31 -04:00
Yong Zhao	e5784493c7	Add SVM aperture on gfx902 Because of HW design change, GPUVM aperture is no longer needed on GFX9 APUs. However, on APUs some functionalities still depend on GPUVM aperture, so we choose to use SVM aperture instead to assume the functionality of previous GPUVM aperture. Change-Id: Ife7f0d598dd7989f2bcf7cdf3466d5a68703ca60 Signed-off-by: Yong Zhao <yong.zhao@amd.com> [ROCm/ROCR-Runtime commit: `3b852b4437`]	2017-10-16 15:06:09 -04:00
Felix Kuehling	b19d5e9f9a	Make system memory allocations NUMA aware Use mbind to specify the NUMA node for system memory allocation. This only works with HSA_USERPTR_FOR_PAGED_MEM=1. Change-Id: I88e7815d5a5aefcc4c22358c1a4a1635d7677ef3 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> [ROCm/ROCR-Runtime commit: `cb4814eadc`]	2017-10-13 16:25:56 -04:00
Chris Freehill	26ebd1727c	Use Major/Minor/Step device numbers to differentiate gfx devices Change-Id: I0901871971a5b33018917ada6c0e69ac7aa91944 [ROCm/ROCR-Runtime commit: `a7cbe78366`]	2017-10-13 16:18:24 -04:00
Tom Stellard	7451594338	Don't mark heap memory as executable v3 Marking heap memory as executable using mprotect() is not allowed by SELinux. mprotect() calls that try to do this will fail on systems with SELinux enabled. This is also a security risk, so it should be fixed even on systems that allow this. Any memory we want to mark as executable must be allocated using mmap(). See https://www.akkadia.org/drepper/selinux-mem.html The two places where we try to mark heap memory as executable both use posix_memalign() to allocate the heap memory. In both cases, the alignment value passed into this function is always equal to PAGE_SIZE, which means that they are safe to replace with mmap(), which guarantees alignment to PAGE_SIZE. In this case PAGE_SIZE has been set to sysconf(_SC_PAGESIZE); v2: - Use MAP_PRIVATE instead of MAP_SHARED. This matches the behavior of memory allocated by posix_memalign() - Ignore alignment hints instead of returning error when we can't accommodate them. - Drop alignment parameter of allocate_exec_aligned_memory() since the only alignment supported is sysconf(_SC_PAGESIZE). - Remove extra parameter from fmm_release(). - Add error path to fmm_allocate_host_cpu() for when mmap fails. v3: - Avoid use after free. Change-Id: I7d51279790d9700bc3fa761c44bfde1c1936019b Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> [ROCm/ROCR-Runtime commit: `e2ed9cf79a`]	2017-10-13 14:05:58 -04:00
Tom Stellard	6b45ae6926	Build fixes for gcc 7.2 v2 src/perfctr.c: In function ‘destroy_shared_region’: src/perfctr.c:154:10: error: logical ‘and’ of equal expressions [-Werror=logical-op] if (sem && sem != SEM_FAILED) { ^~ src/perfctr.c: In function ‘update_block_slots’: src/perfctr.c:323:11: error: logical ‘or’ of equal expressions [-Werror=logical-op] if (!sem || sem == SEM_FAILED) ^~ v2: - Initialize and reset sem to SEM_FAILED. Change-Id: Id70361079b715c4946b13e4460e4fd85d9542c46 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> [ROCm/ROCR-Runtime commit: `4815d9b887`]	2017-10-13 11:37:38 -04:00
Sean Keely	93e58b9100	Capture more memory allocation types with the 2MB allocator. TensorFlow was running out of VRAM due to padding up allocations from legacy memory APIs. These allocations have been added to the fragment allocator to improve VRAM utilization. Change-Id: Ic680fff576a0434b3b17a4c91746da44e09957fa [ROCm/ROCR-Runtime commit: `4f299a9909`]	2017-10-12 23:22:10 -04:00
Amber Lin	a6fc0a8be1	Fix endless loop Fix a while loop that can cause forever loop when cpuid instruction doesn't work properly. Change-Id: Iefa49d23b40c994eb4369621974a7d3c4067e47a Signed-off-by: Amber Lin <Amber.Lin@amd.com> [ROCm/ROCR-Runtime commit: `5815d9de9b`]	2017-10-12 14:22:42 -04:00
Ramesh Errabolu	e5a242acf5	Update Copy requests involving all pools i.e. options -a or -A Change-Id: I0c8d8fbb39f43cd6a1f84ae6ae32337fa9b1f5e2 [ROCm/ROCR-Runtime commit: `703b1466c1`]	2017-10-10 13:01:46 -04:00
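
The first entry in the log above (495c1da5f3) describes preferring memfd_create() for the ring buffer and falling back to /dev/shm where memfd_create() is unavailable. The following is a minimal sketch of that fallback pattern, not the runtime's actual code. The helper names `create_ring_fd` and `map_ring`, the buffer name "ring_buffer", and the temporary-file template are all hypothetical. The raw syscall is used here on the assumption that the C library may not provide a memfd_create() wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return a file descriptor backing `size` bytes of
 * shareable memory, or -1 on failure. */
int create_ring_fd(size_t size)
{
    int fd = -1;

    assert(size > 0);

#ifdef SYS_memfd_create
    /* Preferred path: anonymous memory file, no filesystem mount needed.
     * Fails with ENOSYS on kernels older than Linux 3.17. */
    fd = (int)syscall(SYS_memfd_create, "ring_buffer", 0u);
#endif

    if (fd < 0) {
        /* Fallback path: an unlinked temporary file under /dev/shm. */
        char path[] = "/dev/shm/ring_buffer_XXXXXX";
        fd = mkstemp(path);
        if (fd < 0)
            return -1;
        unlink(path); /* the open descriptor keeps the file alive */
    }

    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: map the descriptor read/write and shared.
 * Returns NULL on failure. The commit message says the runtime maps with
 * PROT_EXEC on most targets; that flag is left out of this sketch. */
void *map_ring(int fd, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Because both paths yield an ordinary file descriptor, the mapping code does not need to know which one succeeded. Two mappings of the same descriptor observe the same bytes.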

2930 Commits