Addresses the following scenarios: a parameter mismatch in a memcpy
node, a difference in the node count, and a difference in the
dependencies of the nodes.
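
The three scenarios can be illustrated with a minimal sketch. The types
and the function compareGraphs below are hypothetical stand-ins invented
for this example; they are not the runtime's graph classes, and the real
checks may be ordered or reported differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified node description (not the runtime's types).
enum class NodeKind { Kernel, Memcpy };

struct MemcpyParams {
  const void* src;
  void* dst;
  std::size_t bytes;
  bool operator==(const MemcpyParams& o) const {
    return src == o.src && dst == o.dst && bytes == o.bytes;
  }
};

struct Node {
  NodeKind kind;
  MemcpyParams memcpy;            // only meaningful when kind == Memcpy
  std::vector<std::size_t> deps;  // indices of the nodes this node depends on
};

enum class CompareResult {
  Same,
  NodeCountDiffers,
  KindDiffers,
  DependenciesDiffer,
  MemcpyParamsDiffer
};

// Compares two graphs node by node and reports the first difference found.
CompareResult compareGraphs(const std::vector<Node>& a, const std::vector<Node>& b) {
  if (a.size() != b.size()) return CompareResult::NodeCountDiffers;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i].kind != b[i].kind) return CompareResult::KindDiffers;
    if (a[i].deps != b[i].deps) return CompareResult::DependenciesDiffer;
    if (a[i].kind == NodeKind::Memcpy && !(a[i].memcpy == b[i].memcpy)) {
      return CompareResult::MemcpyParamsDiffer;
    }
  }
  return CompareResult::Same;
}
```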
Change-Id: I31c6516fb27cc1007809f1b50306fdb0c2356ccc
HIP headers such as hip_math_constants.h, library_types.h,
hip_common.h, channel_descriptor.h, device_functions.h, hip_complex.h,
hip_texture_types.h, math_functions.h, and surface_types.h
are added to HIPRTC.
Change-Id: I4a4c198449ceb609c3ff55e00b43056c1f085431
Add trap handler code to the runtime and compile/load it during
device initialization. The current interface for the trap handler in
PAL is obsolete, and the new interface will be provided later.
Change-Id: I1fa702c5d1f2e6731f781369c980d546cf422328
Using three separate for loops to iterate over all graph nodes for
UpdateStream, FillCommands, and EnqueueCommands causes a performance
drop for large graphs.
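
The sketch below contrasts three passes over the node list with a single
pass. It assumes that the remedy is to merge the loops and that the three
steps of one node do not depend on later nodes; neither is stated here.
GraphNode is a hypothetical type whose methods only record the call order.

```cpp
#include <string>
#include <vector>

// Hypothetical node type; each method just logs that it was called.
struct GraphNode {
  std::vector<std::string>* log;
  void UpdateStream() { log->push_back("update"); }
  void FillCommands() { log->push_back("fill"); }
  void EnqueueCommands() { log->push_back("enqueue"); }
};

// Three passes: the whole node list is traversed three times.
void launchThreePasses(std::vector<GraphNode>& nodes) {
  for (auto& n : nodes) n.UpdateStream();
  for (auto& n : nodes) n.FillCommands();
  for (auto& n : nodes) n.EnqueueCommands();
}

// Single pass: each node is visited once and all three steps run back to back.
void launchSinglePass(std::vector<GraphNode>& nodes) {
  for (auto& n : nodes) {
    n.UpdateStream();
    n.FillCommands();
    n.EnqueueCommands();
  }
}
```

Both variants make the same number of calls; the single pass only changes
the traversal count and the interleaving of the steps.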
Change-Id: I077accf3a4680d5d944b73200fd6498a7a48f25c
When a kernel function expects no parameters, no error should be
returned if both the kernelParams and extra arguments are set to null.
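
A minimal sketch of the intended check follows. validateLaunchArgs, Status,
and expectedParamCount are hypothetical names for this example, and the
rule that rejects both pointers being set is an assumption, not taken from
this change.

```cpp
#include <cstddef>

enum class Status { Success, InvalidValue };

// Hypothetical argument check for a kernel launch.
// expectedParamCount: number of parameters the kernel function declares.
Status validateLaunchArgs(std::size_t expectedParamCount, void** kernelParams, void** extra) {
  // Assumption: supplying both argument sources at once is rejected.
  if (kernelParams != nullptr && extra != nullptr) return Status::InvalidValue;

  if (kernelParams == nullptr && extra == nullptr) {
    // Both null is only an error when the kernel actually expects parameters.
    return expectedParamCount == 0 ? Status::Success : Status::InvalidValue;
  }
  return Status::Success;
}
```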
Change-Id: I5941bcc400b6fb380e623bdae0233ae3e4f73815
This reverts commit f3dc04a50d.
Reason for revert: There are currently some outstanding issues with the COMPILE_SOURCE_WITH_DEVICE_LIBS Comgr action (https://ontrack-internal.amd.com/browse/SWDEV-386072). Once these LLVM issues have been resolved, we can safely re-apply this patch.
Change-Id: I8501967af8496ea50d6e4a97399e45db51bbed1e
- Acquire and Release scopes for AQL need to be balanced. These were
missing on the AMD_OPT_FLUSH=0 path.
Change-Id: Ibf4132eb96800f155d7b664359c790d68a353e60
hipMemcpyArrayToArray, hipMemcpyFromArrayAsync, and hipMemcpyToArrayAsync
are deprecated in CUDA and are not declared in hip_runtime_api.h.
Removed their implementations from hip_memory.cpp.
Change-Id: I9720aec6241515c56c66b7e90a37b2ed53347eb2
This is related to SWDEV-410182, but it's not enough to fix it.
Functions from device-libs are precompiled into LLVM IR in a "target agnostic" way
(in reality, it's not 100% target agnostic, which brings us many headaches).
When linking builtins (like device-libs) from the command line, we use the flag
-mlink-builtin-bitcode. The difference between regular linking of bitcode and
this flag is that the latter propagates target-specific attributes. If these
attributes are not propagated, we can end up with inconsistent target attributes.
Comgr provides the action AMD_COMGR_ACTION_COMPILE_SOURCE_WITH_DEVICE_LIBS_TO_BC
for this exact reason. The old action is currently deprecated and this one should
be used.
Change-Id: I518415214debdf4fedf0b1d81456d6e9fb8a3d19
Use a large signal pool if a profiler is connected or profiling is
force-enabled. This is needed to mitigate signal creation overhead when
profiling, as signals are attached to every packet and deeper batches
may expose the overhead of signal allocation.
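
The selection rule can be sketched as below. The pool sizes and the names
kDefaultSignalPoolSize, kLargeSignalPoolSize, and chooseSignalPoolSize are
hypothetical values chosen for illustration, not the runtime's settings.

```cpp
#include <cstddef>

// Hypothetical pool sizes; the actual values are not given in this change.
constexpr std::size_t kDefaultSignalPoolSize = 32;
constexpr std::size_t kLargeSignalPoolSize = 1024;

// Picks the large pool whenever every packet will carry a signal,
// i.e. a profiler is connected or profiling is force-enabled.
std::size_t chooseSignalPoolSize(bool profilerConnected, bool profilingForced) {
  return (profilerConnected || profilingForced) ? kLargeSignalPoolSize
                                                : kDefaultSignalPoolSize;
}
```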
Change-Id: I8034b8a20b55328b87d593bf044f59672f9653e8
This PR fixes a warning seen in HIP headers due to -Wignored-attributes.
The warning is that attributes for a variable must precede its definition.
This also fixes a potential issue with C-style headers where templates
were declared outside the __cplusplus guard.
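
The template issue can be illustrated with a header layout like the
sketch below. Dim2, dim2_area, and scaleBy are hypothetical names for this
example and do not come from the HIP headers.

```cpp
// Declarations shared between C and C++ translation units.
#ifdef __cplusplus
extern "C" {
#endif

typedef struct Dim2 {
  int x;
  int y;
} Dim2;  // hypothetical type

int dim2_area(Dim2 d);

#ifdef __cplusplus
}  // extern "C"

// Templates are C++ only, so they must stay inside the __cplusplus guard;
// a C compiler including this header never sees them.
template <typename T>
T scaleBy(T value, int factor) {
  return value * static_cast<T>(factor);
}
#endif

// Definition, which would normally live in a source file.
int dim2_area(Dim2 d) { return d.x * d.y; }
```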
Change-Id: I78bfd391717c7891afb442ef79812ea630c0ff49
This reverts commit 44a3935cda.
Implement the correct way to ensure ExternalSemaphores are signalled
only after prior work on the stream has finished.
Change-Id: I9d5974e05d5f229170b928db4566c14e40e3cbaa
- Program a unique AQL index for the debugger. The logic manages an array of AQL packets per HW queue.
- Provide debug state to PAL.
Change-Id: I38fa1f5435fa711fd1d44dc391f2e61eb2a25efa
We currently have __half2 made up of unsigned short instead of __half.
This prevents users from operating seamlessly on individual components.
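
The difference between the two layouts can be shown with a small sketch.
Half, Half2Raw, Half2Typed, and isNegative are hypothetical stand-ins for
this example; they are not the definitions in the HIP headers.

```cpp
#include <type_traits>

// Hypothetical stand-in for __half: wraps the raw 16-bit storage.
struct Half {
  unsigned short bits;
};

// Hypothetical helper that accepts only the half type.
// Bit 15 is the sign bit of an IEEE 754 half-precision value.
inline bool isNegative(Half h) { return (h.bits & 0x8000u) != 0; }

// Components stored as raw unsigned shorts: a caller has to wrap each
// component manually, e.g. isNegative(Half{v.x}).
struct Half2Raw {
  unsigned short x, y;
};

// Components stored as the half type: v.x can be passed on directly.
struct Half2Typed {
  Half x, y;
};

static_assert(std::is_same<decltype(Half2Typed::x), Half>::value,
              "components carry the half type");
static_assert(sizeof(Half2Typed) == sizeof(Half2Raw),
              "the typed layout does not change the storage size");
```

With the typed layout, isNegative(v.x) compiles as is, whereas the raw
layout needs an explicit conversion for every component access.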
Change-Id: I856917db905f68055fdf484f526707fe8ea3117d
This patch did not consider the discussions in SWDEV-270908:
> "we found that in GeekBench5, forcing Wave64 instead of the default
> Wave32 compute policy yields big gains in every subtest except one"
This reverts commit d6dc82b220.
Change-Id: Ice1728585b9d1b2c1b36a06cfa0b8c47cb2bfa49
It seems the Windows linker error seen with https://gerrit-git.amd.com/c/compute/ec/clr/+/886078
also exists on Linux, but only appears with the dynamic linker. This fixes both use cases.
Change-Id: I1bb6df151eb1e09f75af4d461ead28bcbaa5f236