Commit Graph

12410 Commits

Author SHA1 Message Date
Rahul Manocha e46733affe SWDEV-439234 - Fix for Segfault in ValidateMemAccess
Change-Id: I251d277eb5af16ba5c0de85ffd142a5f64fa469d


[ROCm/clr commit: 07261002b1]
2024-09-18 10:52:32 -04:00
Daniel Livingston 7c0ff614a2 SWDEV-77148 - Add UberTrace support to PAL device
This PR adds UberTrace-based tracing support to ROCclr's PAL device class.
Legacy RGP-based tracing is still available and is the default.
If UberTrace support is enabled tool-side, this new code path will activate.

Change-Id: I268b2dcef70e850a50e2caef8355f38bf51d4641


[ROCm/clr commit: e550032d25]
2024-09-17 16:06:37 -04:00
Satyanvesh Dittakavi 1ef9123b54 SWDEV-478776 - Fix segfault with streamsync using hipStreamLegacy
Change-Id: Ifb412d0bcfa33bc1130b47b757ee276ca9bc1c3a


[ROCm/clr commit: 8ee065c5bd]
2024-09-17 15:02:18 -04:00
Jatin Chaudhary 274fd2628f SWDEV-478065 - Embed host thread in shared_ptr
This shows up in some valgrind runs. Make sure the resources are
released.

Change-Id: I34c25c00370a221585895655744831215136d5f4


[ROCm/clr commit: 4b03017e8a]
2024-09-17 09:53:51 -04:00
Jaydeep Patel 7b2cd9111c SWDEV-478049 - Clear packets list as it is being added back later during submitKernelInternal while setting params for graph node.
Change-Id: I7451ffda93d94eeda5e1be05bb87558ae86d2a19


[ROCm/clr commit: 2494992695]
2024-09-16 23:10:32 -04:00
Ioannis Assiouras b5a8d775d6 SWDEV-476929 - Introduce an activeQueues set
The new set tracks only the queues that have a command
submitted to them. This allows for fast iteration
in waitActiveStreams.

Change-Id: I2c832eefa01280d9a87a5f57874d36d2e9441de7


[ROCm/clr commit: bcc545e6b8]
2024-09-16 15:53:49 -04:00
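The idea behind the activeQueues commit above can be sketched as follows. This is a minimal illustration, not the ROCclr implementation: `Queue`, `QueueTracker`, `submit`, and `waitActive` are hypothetical names, and the "wait" is a stand-in that only resets a counter. The point is that waiting iterates only over queues that received a command, not over every queue on the device.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Hypothetical stand-in for a device queue; not the ROCclr type.
struct Queue {
  int pending = 0;  // commands submitted but not yet waited on
};

// Tracks only the queues that have had a command submitted to them.
class QueueTracker {
 public:
  // Submitting a command marks the queue as active.
  void submit(Queue* q) {
    assert(q != nullptr);
    ++q->pending;
    active_.insert(q);  // inserting an already-active queue is a no-op
  }

  // Visits only active queues, then clears the set.
  // Returns the number of queues that were visited.
  std::size_t waitActive() {
    std::size_t visited = 0;
    for (Queue* q : active_) {
      q->pending = 0;  // stand-in for a real blocking wait
      ++visited;
    }
    active_.clear();
    return visited;
  }

  std::size_t activeCount() const { return active_.size(); }

 private:
  std::unordered_set<Queue*> active_;
};
```

With many idle queues and only a few busy ones, `waitActive` touches just the busy ones, which is the fast iteration the commit message describes.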
kjayapra-amd 196e2a53bc SWDEV-484188 - Change few std::array style globals to C style to avoid optimization on Static Libs.
Change-Id: Iab6d3c040b8d088013daa08093898be99dd3a971


[ROCm/clr commit: d81c5d3d7f]
2024-09-16 09:46:56 -04:00
kjayapra-amd e05182981a SWDEV-484188 - Moving std::maps into struct const and into amd::Kernel class.
Change-Id: Ie4d5a64511412fdb498b045aaffb52c3a1286de6


[ROCm/clr commit: 4ecd77df5e]
2024-09-15 09:14:51 -04:00
Ranjith Ramakrishnan d7d12c0b49 SWDEV-437189 - Provide option to enable/disable CPACK_SET_DESTDIR
The variable is already set as a cache variable so that the user can override it,
but the hard-coded setting was preventing the override. Removed the hard-coded setting.

Change-Id: I2aecc18ce4f1d1b523ba267ef1c8ef4ea1168d9c


[ROCm/clr commit: 4d0b815d06]
2024-09-13 15:49:53 -04:00
Rahul Manocha b70968d769 SWDEV-480536 - Disable cpu wait in device synchronize
1) Currently cpu wait is set to true, which makes the host wait for the last
command in the queue to finish even if the kernel execution has already
finished, causing a delay in the device sync call.
2) Device sync only needs to await completion when the hw event
is not ready.


Change-Id: I91e3e89d39a1193ae06abac822cea8ae651493a5


[ROCm/clr commit: eb1089593e]
2024-09-13 15:31:32 -04:00
Ajay d7f4f778b3 SWDEV-465215 - hipFuncSetAttribute hipFuncGetAttributes fixes
Change-Id: I2151e4470d63918ff6b809a8fdeaae5bea5cc899


[ROCm/clr commit: c9955a1cea]
2024-09-13 14:05:25 -04:00
Saleel Kudchadker 343bdf3187 SWDEV-478624 - Use readback workaround to ensure kernel arg coherence
Use env var DEBUG_CLR_KERNARG_HDP_FLUSH_WA=1 to fall back to HDP flush
workaround. The default is 0

Change-Id: I7bdb9be61da60c30d15ac9991b7cd27351e1831c


[ROCm/clr commit: 9de6d4d46c]
2024-09-11 14:53:15 -04:00
Ajay 2a79ff2bca SWDEV-471863 - avoid copy of invisibleHeap
Change-Id: Ieb0aa22ac6d0d01cb9ca7fbf1305df03a1ab3cdf


[ROCm/clr commit: 5a810f789a]
2024-09-11 13:24:31 -04:00
Jaydeep Patel 7fa7a7cae5 SWDEV-475938 - Update dynamic stack in submit kernel internal.
Change-Id: I816bf9cfe8aaac5486ff3b719dbdc4f4d6134e01


[ROCm/clr commit: 9c90bc43a5]
2024-09-11 00:59:45 -04:00
Chong Li 4979c2f206 SWDEV-478929 - Benchmark ReallyQuickPureX Failed
Ensure the member functions Alloc() and Free() of command_pool_ are not
accessed after command_pool_ has been destructed.

Signed-off-by: Chong Li <chongli2@amd.com>
Change-Id: Ic2d36423302518a030bd61fa399290ebe2ed8194


[ROCm/clr commit: e6a5c81221]
2024-09-10 22:08:18 -04:00
Saleel Kudchadker a3dc515316 SWDEV-301667 - Improve kernel logging
Change-Id: I4b2b1950e3ab7124fd41af9a92a677c48d6da5eb


[ROCm/clr commit: abc80fcc2f]
2024-09-10 13:43:58 -04:00
Saleel Kudchadker 95c84bef10 SWDEV-481974 - Clear dependent signal bit for barrier value
Change-Id: I3ffda051fa8538970fbb1964beb1f538fce0782c


[ROCm/clr commit: 62a7fed90d]
2024-09-10 13:43:04 -04:00
Ioannis Assiouras 8f3e41932c SWDEV-482553 - Removed setting of BUILD_SHARED_LIBS from hip-config.cmake
Change-Id: I84eb33939d47dde1dd389741c431ee0e5955973b


[ROCm/clr commit: 0b8bc6682f]
2024-09-09 13:27:53 +01:00
kjayapra-amd eecbcddaf3 SWDEV-439234 - Access check before memcpy and kernel operations.
Change-Id: I7057125c03460db205409e19980145298c190fe2


[ROCm/clr commit: 6211037f63]
2024-09-06 14:30:00 -04:00
Julia Jiang 29e9bed35d SWDEV481762 - Updated definition of 'DEPRECATED' in header file
Change-Id: I88986b8e1815f3d816595f3eb2da8a6c1c1c2993
Jenifer helped make a combined PSDB build, together with the change in hip repos
https://gerrit-git.amd.com/c/compute/ec/hip/+/1114046
Combined PSDB verification passed.
http://rocm-ci.amd.com/job/compute-psdb-staging-hip/17293/


[ROCm/clr commit: bb03ef11a3]
2024-09-05 15:41:04 -04:00
victzhan 11632a954a SWDEV-477218 - Implement hipDeviceGetTexture1DLinearMaxWidth
Change-Id: I8103f710abeb869f5f84be61c57a30b24356def6


[ROCm/clr commit: 8be00b6602]
2024-09-05 15:09:38 -04:00
victzhan fde29b7c06 Revert "SWDEV-458943 - make new AMD_MONITOR on"
This reverts commit 47dcfbae6b.

Change-Id: I2a7ddb2d4340224f43749a2ea91a894a8a95b83b


[ROCm/clr commit: 7a01db98e9]
2024-09-05 10:10:50 -04:00
Rahul Manocha b25fd0dc81 SWDEV-479575 - Graph clone root size check
Change-Id: I34dd43ea36ce1e2623198e6ce1179318b9f7e277


[ROCm/clr commit: dbf00966b9]
2024-09-04 11:54:15 -04:00
Marko Arandjelovic 9fab61ebe3 SWDEV-478206 - Fix hipTexRefSetArray
Change-Id: I6bd6ce60163d4f79001fce75e40ef46f1fcb7c3f


[ROCm/clr commit: 224334e1d2]
2024-09-04 03:41:25 -04:00
Jimbo Xie 2036d66b95 SWDEV-403363 - add gfx1152 runtime support
Change-Id: I2f59ddb38a98d9f8edec5d1548232d4d826b7d04
(cherry picked from commit 5e94656f744e315ee7ae1285d3e6dd515f9d66a8)


[ROCm/clr commit: 3bdbc1eaf3]
2024-09-03 17:12:24 -04:00
Rahul Manocha 51c86bc5cb SWDEV-468039 - Define formatting for fp8 ocp data type
Change-Id: Ie3c8bc71b4cefaa20e9e5d80636c2d26a05e91a7


[ROCm/clr commit: 1f333f64c4]
2024-09-03 11:35:48 -04:00
Rahul Manocha c430e1c44d SWDEV-478921 - Destroy Queue created by Coop Launch
Change-Id: I7f31ce05421479ff1de138cae26aafa071e956e2


[ROCm/clr commit: ddbd7039b0]
2024-09-02 02:35:08 -04:00
Rahul Manocha 900f906827 SWDEV-462192 SWDEV-459056 Check if m_streams is empty
1) Since g_devices is not initialized when the stream_per_thread constructor
is called on Windows, m_streams is empty when hipDeviceReset is called.
2) clear_spt tries to access the empty vector, causing segfaults in the
hipDeviceReset call.
3) On Linux, ROCCLR_INIT_PRIORITY makes sure that g_devices is initialized
before the tls constructor creates the stream_per_thread object.

Change-Id: Ib2ba643d1278d820287ea3b242ed0878d7529165


[ROCm/clr commit: 450eca293b]
2024-09-01 17:17:20 -04:00
Ioannis Assiouras ec3d97ab8d SWDEV-477039 - Use rocm_agent_enumerator to setup targets for static build
The amdgpu-arch tool is not supported for static build.
This commit adds changes to detect the build type during
cmake config and use the rocm_agent_enumerator for static build.

Change-Id: I8a295e01f54075507390ef540f16b28bb20237a9


[ROCm/clr commit: a02888af58]
2024-08-29 10:06:01 -04:00
Marko Arandjelovic 382f435359 SWDEV-478520 - Prevent segfaults in hipTexRefSetAddress
Change-Id: I9a57ccb81c574e35e7ebf6d71512f9249413bc3e


[ROCm/clr commit: ddc5744c19]
2024-08-29 05:05:37 -04:00
Anusha GodavarthySurya fc014587f8 SWDEV-477324 - Graph Capture memcpy D2D
Change-Id: Ifaa4d78854c03b3150233142df187c9bbf731cab


[ROCm/clr commit: e98179d924]
2024-08-28 23:36:51 -04:00
Julia Jiang 71d97112cc SWDEV-476623 - correct the format on the fix for clCopyImage
Change-Id: I3a3fb2eaa338ff4e298a43e583fcf94ec7cabdf6


[ROCm/clr commit: 417d3279f9]
2024-08-28 16:16:24 -04:00
Julia Jiang 049a21f3da SWDEV-476623 - Fix test failures for clCopyImage
Change-Id: I971c5be98304bdbef0feec73e15ebd61a131b12f


[ROCm/clr commit: c3c41dae0d]
2024-08-27 11:43:12 -04:00
Tao Sang eaa7fd41cf SWDEV-474989 - Fix issues of texture tests
Change-Id: Ie1d874742b804f82ceda68864fa54f5d59c092b8


[ROCm/clr commit: 4b211f7272]
2024-08-27 11:29:43 -04:00
kjayapra-amd afd72f9ad0 SWDEV-478099 - Fix multiple mapping case on PAL/Windows backend.
Change-Id: Id1fe7939fbf90649cda1848890b3b4ca9a1fcd00


[ROCm/clr commit: 2a9cb89228]
2024-08-27 11:19:39 -04:00
Ioannis Assiouras a00f071579 SWDEV-470372 - Added hipExtHostAlloc API
This change adds a new HIP API `hipExtHostAlloc` which preserves
the functionality of `hipHostMalloc`.

Change-Id: I13504c6fc13465ddd7aed329795bb4f2fef1baff


[ROCm/clr commit: 2c84211b58]
2024-08-27 08:26:03 -04:00
Jatin Chaudhary 5e3fe9bd1f SWDEV-480489 - fix unsafeAtomicAdd
Integration into PyTorch exposed some issues, including value narrowing; to
fix this we are now using unions. Also removed the check for the -munsafe*
compiler flag. The check now relies only on builtin detection.

Change-Id: I49364503fa429bd862952f9b29879072afa6d553


[ROCm/clr commit: bb52d9ed62]
2024-08-27 06:29:11 -04:00
Vladana Stojiljkovic e8b3e9e5b6 SWDEV-478207 - Return hipSuccess on the end of hipTexRefGetMaxAnisotropy
Change-Id: I0c4d6d13a178af8449853c87e62a1868eb17f87d


[ROCm/clr commit: f5e6e27fe1]
2024-08-27 05:30:36 -04:00
ksankisa 3bcd901f06 [SWDEV-469495] Compile blit kernels with -fsanitize=address when asan is enabled.
Change-Id: I96e1abef43317cd58329c4a159f807878bc48cf4


[ROCm/clr commit: e76bf653fb]
2024-08-27 01:27:31 -04:00
Sameer Sahasrabuddhe aacf75f480 SWDEV-480725: missing __ockl_wfall __ockl_wfany in amd_hip_bf16.h
Change-Id: Iff4aeec411bfeaf4cc187c515e2da3d5898f89cb


[ROCm/clr commit: 6df2da65cd]
2024-08-25 22:49:14 -04:00
kjayapra-amd f370cced08 SWDEV-479620 - Change argument type to size_t from uint64_t in nonTemporalMemcpy function.
Change-Id: I31f8a2b00685789b027d78be40a9f82c235f51b9


[ROCm/clr commit: 00eb038eec]
2024-08-24 07:42:37 -04:00
Ajay c774a3470d SWDEV-478881 - Fix log AMD_LOG file corruption
hiprtc and hip APIs use the same file.
Append to the file instead of writing from the start of the file.

Change-Id: I2703f9bb67f0c51b557a058daab129679a0b5dd9


[ROCm/clr commit: e07172ff57]
2024-08-23 11:19:48 -04:00
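The append-instead-of-overwrite fix described in the AMD_LOG commit above can be illustrated with a small sketch. This is an assumption-laden example and not the CLR logging code: `appendLog` and `countLines` are hypothetical helpers. It shows that opening the file in append mode lets two writers share a file without one clobbering the other's output.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Appends one line to a shared log file. std::ios::app positions every
// write at the end of the file, so earlier output is not overwritten.
inline bool appendLog(const std::string& path, const std::string& line) {
  assert(!path.empty());
  std::ofstream out(path, std::ios::out | std::ios::app);
  if (!out) return false;
  out << line << '\n';
  return static_cast<bool>(out);
}

// Counts the lines currently in the file; returns 0 if it cannot be opened.
inline std::size_t countLines(const std::string& path) {
  std::ifstream in(path);
  std::size_t n = 0;
  std::string tmp;
  while (std::getline(in, tmp)) ++n;
  return n;
}
```

Had the file been opened without `std::ios::app`, the second writer would have truncated the file and discarded the first writer's line.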
kjayapra-amd 2af78c954b SWDEV-478097 - Check for parents size in case of VA Mem object.
Change-Id: Icfdeabeb178c0dcc8c3a4bc48eec40067985794e


[ROCm/clr commit: d7b097c994]
2024-08-22 14:18:51 -04:00
Julia Jiang eac092348a SWDEV-479940 - Update the changelog for 6.3
Change-Id: I2b465d297466b9c4884e30649bd2ea12a4c4229c


[ROCm/clr commit: 6576be5602]
2024-08-22 11:28:46 -04:00
Shane Xiao 06912065d9 [SWDEV-479204] Fix the hipGraph AQL package fill issue
This patch fixes a potential issue where the AQL header is filled before
the AQL body. The HSA spec specifies "Packet processors may
process AQL packets after the packet format field is updated, but
before the doorbell is signaled."
However, the hipGraph AQL packet was given a valid header before its
body was filled, so the CP could potentially
receive an invalid AQL body.

Change-Id: I84af798c19ee2b8805ba19732b0eabdea2958a96


[ROCm/clr commit: 3959b5be1e]
2024-08-21 21:49:11 -04:00
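The ordering requirement in the hipGraph AQL commit above can be sketched as follows. This is a simplified illustration, not the CLR code or the real AQL layout: `Packet`, `publish`, `tryConsume`, and the two header constants are hypothetical. The writer fills the body first and stores the header last with release semantics, so a consumer that sees a valid header also sees a complete body.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Illustrative header values only; the real encodings come from the HSA spec.
constexpr std::uint16_t kInvalidHeader = 1;
constexpr std::uint16_t kDispatchHeader = 2;

// Hypothetical, simplified packet slot; not the real AQL packet layout.
struct Packet {
  std::atomic<std::uint16_t> header{kInvalidHeader};
  std::uint16_t setup = 0;
  std::uint32_t grid_size = 0;
  std::uint64_t kernel_object = 0;
};

// Writes the body first, then publishes the header with a release store.
inline void publish(Packet& slot, std::uint16_t header, std::uint16_t setup,
                    std::uint32_t grid_size, std::uint64_t kernel_object) {
  assert(slot.header.load(std::memory_order_relaxed) == kInvalidHeader);
  slot.setup = setup;
  slot.grid_size = grid_size;
  slot.kernel_object = kernel_object;
  slot.header.store(header, std::memory_order_release);  // header goes last
}

// Consumer side: reads the body only after observing a valid header.
inline bool tryConsume(const Packet& slot, std::uint64_t& kernel_out) {
  if (slot.header.load(std::memory_order_acquire) == kInvalidHeader) {
    return false;
  }
  kernel_out = slot.kernel_object;
  return true;
}
```

Writing the header first would open a window in which the consumer could observe a valid header alongside a stale body, which is the hazard the commit message describes.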
Rahul Manocha 1b14058283 SWDEV-474617 SWDEV-464679 - Fix segfault in palvirtual due to peer memory access
Change-Id: Ib8b641712d78acf8bc073ca5705dea97af6f944a


[ROCm/clr commit: 432bdd7bf2]
2024-08-21 11:34:15 -04:00
Sourabh Betigeri d1d6c448c9 SWDEV-462192 SWDEV-459056 - Fixes corruption
SPT is destroyed with hipDeviceReset(). If a
stream is created right after reset, the same
object id could be reused. Later, the SPT destructor
incorrectly treats the stream as valid because
it refers to the reused object id, causing the
corruption.

Change-Id: I3b1f7ffdf8bab874dca7b8fde22318162997b8f6


[ROCm/clr commit: f6a68b3c2e]
2024-08-21 11:33:44 -04:00
Ioannis Assiouras b5acdd6fdc SWDEV-470612 - Added fixes in optimized multistream path for graph execution
This change adds fixes in the optimized multistream path for childGraph use cases.

1) For childgraph nodes, rely on runNodes() only to process
   the childgraph and skip calls to createCommand and enqueueCommands.
   This ensures that the start/end markers are enqueued correctly
   with respect to the childGraph commands.
   In addition, the runNodes() for the childgraph should be called after
   the dependency walkthrough to make sure that the subgraph is executed once.

2) Nodes with no outgoing edges should be marked
   as leaves regardless of which stream they are assigned to.
   This is to ensure that marker dependencies from nodes
   that run on non-zero stream to subgraph leafs that run on zero stream
   are still set up correctly.

Change-Id: I4a5f4f3b0e0d01e515cdcb045b46c2798f291255


[ROCm/clr commit: 464b99373b]
2024-08-21 10:11:24 -04:00
Anusha GodavarthySurya c2a4062392 SWDEV-470612 - Add stream id to DOT print when DEBUG_HIP_GRAPH_DOT_PRINT is enabled
Change-Id: Iec3630ba6fb2206925653ea939770bb9820d7c52


[ROCm/clr commit: 19bf971134]
2024-08-21 00:37:41 -04:00
taosang2 785d6e7d01 SWDEV-475144 - Fix random language string
Fix a random language string that leads to a compile failure
of the trap handler and a TDR in hipMemset() on a VM in release
mode of hip-rt

Change-Id: Ie1d874742b804f62ceda68064fa54f5d39c092b8


[ROCm/clr commit: 857d0d60b9]
2024-08-20 17:42:31 -04:00