Reapply amdgpu-windows-interop revert. (#1893)

## Overview and rationale

This reverts https://github.com/ROCm/rocm-systems/pull/1886, which...
* Re-applies https://github.com/ROCm/rocm-systems/pull/1866
* Reverts https://github.com/ROCm/rocm-systems/pull/1728

(So it restores the [`amdgpu-windows-interop/`](https://github.com/ROCm/rocm-systems/tree/develop/shared/amdgpu-windows-interop) folder to its state from a few weeks ago.)

The rationale for this change is at https://github.com/ROCm/rocm-systems/pull/1866:
> Last PAL update broke applications on gfx12 Windows.

## Cross-repository change details

That PR (#1866) failed to build but was merged with this explanation:

> TheRock CI Windows build fails as expected with this revert.
> 
> References to these PAL members need to be stripped out in a patch on TheRock.
> 
> ```
> 11.3	C:\home\runner\_work\rocm-systems\rocm-systems\projects\clr\rocclr\device\pal\palubercapturemgr.cpp(152): error C2039: 'RegisterTraceStateChangeCallback': is not a member of 'GpuUtil::TraceSession'
> 11.4	C:\home\runner\_work\rocm-systems\rocm-systems\shared\amdgpu-windows-interop\pal\inc\gpuUtil\palTraceSession.h(372): note: see declaration of 'GpuUtil::TraceSession'
> 11.4	C:\home\runner\_work\rocm-systems\rocm-systems\projects\clr\rocclr\device\pal\palubercapturemgr.cpp(195): error C2039: 'UnregisterTraceStateChangeCallback': is not a member of 'GpuUtil::TraceSession'
> 11.4	C:\home\runner\_work\rocm-systems\rocm-systems\shared\amdgpu-windows-interop\pal\inc\gpuUtil\palTraceSession.h(372): note: see declaration of 'GpuUtil::TraceSession'
> ```
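
To see which call sites a TheRock patch would need to strip, one option is a plain recursive search for the two member names from the errors above. This is only a sketch: `find_trace_callback_refs` is a hypothetical helper name, and the directory passed to it is whatever part of the tree you want to scan.

```bash
# Hypothetical helper: list call sites of the two members named in the build errors.
find_trace_callback_refs() {
  local source_dir="$1"
  # -r: recurse, -n: show line numbers, -E: one extended regex covering both names.
  grep -rnE '(Unr|R)egisterTraceStateChangeCallback' "$source_dir"
}
```

For example, `find_trace_callback_refs projects/clr/rocclr/device/pal` scans the directory that appears in the error log. `grep` exits non-zero when nothing matches, which doubles as a quick "already stripped" check.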

The patch in TheRock was updated in https://github.com/ROCm/TheRock/pull/2154. This rolls forward by updating the ref for TheRock.
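
Several workflows pin TheRock independently, so after a ref bump it may be worth confirming that they all agree. The sketch below assumes each pin is written as a `ref:` key within three lines after a `repository: "ROCm/TheRock"` key, as in the workflow hunks of this commit. `list_therock_refs` is a hypothetical helper name, and the workflow file paths are supplied by the caller.

```bash
# Hypothetical helper: print the distinct TheRock refs pinned in the given workflow files.
list_therock_refs() {
  # After a `repository: "ROCm/TheRock"` line, look at most 3 lines ahead for `ref:`.
  awk '
    /repository: *"?ROCm\/TheRock"?/ { window = 3; next }
    window > 0 {
      if ($1 == "ref:") { print $2; window = 0 } else { window-- }
    }
  ' "$@" | sort -u
}
```

Called as `list_therock_refs .github/workflows/*.yml` (the glob is an assumption about where the workflow files live), a single line of output means every workflow points at the same commit.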

That original PR could have been sequenced differently to avoid a build break - perhaps by
* Pointing to a branch in TheRock with the patch rebased
* Deleting the patch in the workflows here but holding a local copy of the patch to be applied in workflows
* Landing the patch as a normal commit instead of carrying it as a patch at all
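
Whichever option is chosen, the common step is checking that the carried patch series still applies before merging. A minimal sketch, assuming bash and a git new enough to support `git worktree`: apply the series in a throwaway worktree so the real checkout is untouched. `check_patch_series` is a hypothetical helper, not something that exists in either repository.

```bash
# Hypothetical helper: dry-run a patch series in a throwaway worktree.
# Returns 0 if every patch applies, 1 if any fails, 2 on setup errors.
check_patch_series() {
  local repo_dir patch_dir scratch status=0
  repo_dir="$(cd "$1" && pwd)" || return 2
  patch_dir="$(cd "$2" && pwd)" || return 2
  scratch="$(mktemp -d)" || return 2

  # Detached worktree at the repo's current HEAD; the main checkout is not modified.
  git -C "$repo_dir" worktree add --detach "$scratch/wt" HEAD >/dev/null 2>&1 || return 2

  # Apply in filename order, matching the `*.patch` glob used by the workflows.
  if ! git -C "$scratch/wt" -c user.name="dry-run" -c user.email="dry-run@example.invalid" \
      am --whitespace=nowarn "$patch_dir"/*.patch >/dev/null 2>&1; then
    status=1
    git -C "$scratch/wt" am --abort >/dev/null 2>&1 || true
  fi

  # Clean up; the trial commits are left behind only as unreachable objects.
  git -C "$repo_dir" worktree remove --force "$scratch/wt" >/dev/null 2>&1
  rm -rf "$scratch"
  return "$status"
}
```

With the layout the workflows use, a call might look like `check_patch_series . ./TheRock/patches/amd-mainline/rocm-systems`. An empty patch directory is reported as a failure, because the unexpanded glob is passed to `git am`.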

## Test plan

1. Watch TheRock CI here (https://github.com/ROCm/rocm-systems/actions/runs/19447202693/job/55644411119?pr=1893)
2. Build locally:
    
    ```bash
    # In rocm-systems
    git am --whitespace=nowarn D:/projects/TheRock/patches/amd-mainline/rocm-systems/0001-Revert-SWDEV-543498-Some-compute-Ubertrace-profiles-.patch
    git am --whitespace=nowarn D:/projects/TheRock/patches/amd-mainline/rocm-systems/0003-Use-is_versioned-true-consistently-in-both-Comgr-Loa.patch
    git am --whitespace=nowarn D:/projects/TheRock/patches/amd-mainline/rocm-systems/0006-Explicitly-load-libamdhip64.so.7.patch
    # Note: the build fails with the observed errors if patch 0001 is not applied!
    
    # In TheRock
    # IMPORTANT: set THEROCK_ROCM_SYSTEMS_SOURCE_DIR so the build uses the local rocm-systems checkout.
    cmake -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_C_COMPILER=cl.exe -DCMAKE_CXX_COMPILER=cl.exe \
      -DCMAKE_C_COMPILER_LAUNCHER=ccache -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
      -DPython3_EXECUTABLE=d:/projects/TheRock/.venv/Scripts/python \
      -DTHEROCK_ROCM_SYSTEMS_SOURCE_DIR=d:/projects/TheRock/../rocm-systems \
      -DTHEROCK_AMDGPU_FAMILIES=gfx110X-all \
      -DBUILD_TESTING=ON \
      -DTHEROCK_ENABLE_ALL=ON \
      -Damd-llvm_BUILD_TYPE=RelWithDebInfo \
      -S D:/projects/TheRock \
      -B D:/projects/TheRock/build \
      -G Ninja
    
    cmake --build D:/projects/TheRock/build --target hip-clr
    # [build] Build finished with exit code 0
    cmake --build D:/projects/TheRock/build --target ocl-clr+dist
    # [build] Build finished with exit code 0
    ```
Scott Todd
2025-11-18 07:17:06 -08:00
committed by GitHub
parent 44a32e23ac
commit fa772be675
139 changed files with 44141 additions and 44363 deletions
+2 -3
@@ -38,7 +38,7 @@ jobs:
with:
repository: "ROCm/TheRock"
path: "TheRock"
ref: 6fab5d65a552483bcfa1f6ccaaabf699c8188c1e # 2025-11-06 commit
ref: eb8f187ff47eb6af9cd5aaa0b8d9a04b06b12796 # 2025-11-15 commit
- name: Install python deps
run: |
@@ -66,7 +66,6 @@ jobs:
run: |
# Remove patches here if they cannot be applied cleanly, and they have not been deleted from TheRock repo
# rm ./TheRock/patches/amd-mainline/rocm-systems/*.patch
rm ./TheRock/patches/amd-mainline/rocm-systems/0008-Find-bundled-libelf.patch
./TheRock/build_tools/fetch_sources.py --jobs 12 --no-include-rocm-systems --no-include-rocm-libraries --no-include-ml-frameworks
@@ -110,7 +109,7 @@ jobs:
uses: aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722 # v4.1.0
with:
aws-region: us-east-2
role-to-assume: arn:aws:iam::692859939525:role/therock-artifacts-external
role-to-assume: arn:aws:iam::692859939525:role/therock-ci-external
- name: Post Build Upload
if: always()
+2 -7
@@ -39,7 +39,7 @@ jobs:
with:
repository: "ROCm/TheRock"
path: "TheRock"
ref: 6fab5d65a552483bcfa1f6ccaaabf699c8188c1e # 2025-11-06 commit
ref: eb8f187ff47eb6af9cd5aaa0b8d9a04b06b12796 # 2025-11-15 commit
- name: Set up Python
uses: actions/setup-python@e797f83bcb11b83ae66e0230d6156d7c80228e7c # v6.0.0
@@ -54,7 +54,6 @@ jobs:
run: |
# Remove patches here if they cannot be applied cleanly, and they have not been deleted from TheRock repo
# rm ./TheRock/patches/amd-mainline/rocm-systems/*.patch
rm ./TheRock/patches/amd-mainline/rocm-systems/0008-Find-bundled-libelf.patch
git -c user.name="therockbot" -c "user.email=therockbot@amd.com" am --whitespace=nowarn ./TheRock/patches/amd-mainline/rocm-systems/*.patch
- name: Install requirements
@@ -72,10 +71,6 @@ jobs:
with:
version: '3.62.0'
- uses: iterative/setup-dvc@4bdfd2b0f6f1ad7e08afadb03b1a895c352a5239 # v2.0.0
with:
version: '3.62.0'
# After other installs, so MSVC get priority in the PATH.
- name: Configure MSVC
uses: ilammy/msvc-dev-cmd@0b201ec74fa43914dc39ae48a89fd1d8cb592756 # v1.13.0
@@ -138,7 +133,7 @@ jobs:
uses: aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722 # v4.1.0
with:
aws-region: us-east-2
role-to-assume: arn:aws:iam::692859939525:role/therock-artifacts-external
role-to-assume: arn:aws:iam::692859939525:role/therock-ci-external
special-characters-workaround: true
- name: Post Build Upload
+1 -1
@@ -92,7 +92,7 @@ jobs:
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
repository: "ROCm/TheRock"
ref: 6fab5d65a552483bcfa1f6ccaaabf699c8188c1e # 2025-11-06 commit
ref: eb8f187ff47eb6af9cd5aaa0b8d9a04b06b12796 # 2025-11-15 commit
- name: Run setup test environment workflow
uses: './.github/actions/setup_test_environment'
@@ -298,21 +298,12 @@ enum ImageLayoutUsageFlags : uint32
/// display engine.
LayoutUncompressed = 0x00001000, ///< Metadata fully decompressed/expanded layout
LayoutSampleRate = 0x00002000, ///< CmdBindSampleRateImage() source.
LayoutVideoEncodeRead = 0x00004000, ///< Video encoder input image layout, output is buffer so no layout.
LayoutVideoDecodeWrite = 0x00008000, ///< Video decoder output image layout, input is buffer so no layout.
LayoutAllUsages = 0x0000FFFF,
LayoutAllUsages = 0x00003FFF
};
/// Bitmask values that can be ORed together to specify all potential engines an image might be used on. Such a
/// mask should be specified in the engines field of ImageLayout.
///
/// Generally speaking, image transition inside the all video queues doesn't require barrier including stall, cache
/// sync and layout transition. For transition across queues, we rely inter-queue sync to guarantee the stall
/// and cache sync. However, it's possible the layout transition is incompatible and we need handle it. Clients can
/// call @ref IImage::IsLayoutTransitionCompatible() to check if the transition is compatible or not; if not,
/// must issue a barrier to do the layout transition. Note that Layout transitions must always be executed on Universal
/// or Compute queues; and DMA queue only supports metadata initialization transition.
///
/// If the client API is unable to determine which engines might be used, it should specify all possible engines
/// corresponding to the usage flags.
enum ImageLayoutEngineFlags : uint32
@@ -370,35 +361,25 @@ enum CacheCoherencyUsageFlags : uint32
/// Bitmask values for the flags parameter of ICmdBuffer::CmdClearColorImage().
enum ClearColorImageFlags : uint32
{
ColorClearAutoSync = 0x01, ///< PAL will automatically insert required barrier synchronization before
ColorClearAutoSync = 0x00000001, ///< PAL will automatically insert required barrier synchronization before
/// and after the clear assuming all subresources to be cleared are currently
/// ready for rendering as a color target (as is required by API convention in
/// DX12). Allows reduced sync costs in some situations since PAL knows
/// the details of how the clear will be performed.
ColorClearForceSlow = 0x02, ///< Force these to use slow clears.
ColorClearSkipIfSlow = 0x04, ///< Only issue the clear if it is a fast clear.
ColorClearInitMetaData = 0x08, ///< PAL will make sure initialize all metadata (including internal metadata state
/// data) for this image to be cleared. This is typically used for placed resource
/// initialization (as required by API convention in DX12); should only be used
/// when this is a full box clear.
ColorClearAllFlags = 0x0F ///< Clients should NOT use it, for internal static_assert purpose only.
ColorClearForceSlow = 0x00000002, ///< Force these to use slow clears.
ColorClearSkipIfSlow = 0x00000004, ///< Only issue the clear if it is a fast clear.
ColorClearAllFlags = 0x00000007 ///< Clients should NOT use it, for internal static_assert purpose only.
};
/// Bitmask values for the flags parameter of ICmdBuffer::CmdClearDepthStencil().
enum ClearDepthStencilFlags : uint32
{
DsClearAutoSync = 0x01, ///< PAL will automatically insert required barrier synchronization before
DsClearAutoSync = 0x00000001, ///< PAL will automatically insert required barrier synchronization before
/// and after the clear assuming all subresources to be cleared are currently
/// ready for rendering as a depth/stencil target (as is required by API convention
/// in DX12). Allows reduced sync costs in some situations since PAL knows the
/// details of how the clear will be performed.
DsClearInitMetaData = 0x02, ///< PAL will make sure initialize all metadata (including internal metadata state
/// data) for this image to be cleared. This is typically used for placed resource
/// initialization (as is required by API convention in DX12); should only be used
/// when this is a full box clear. Note that if clients call @ref
/// CmdClearDepthStencil() with this flag, MUST call @ref CmdUpdateHiSPretests()
/// after clear call otherwise HiSPretests will be overridden to initialized state.
DsClearAllFlags = 0x03 ///< Clients should NOT use it, for internal static_assert purpose only.
DsClearAllFlags = 0x00000001 ///< Clients should NOT use it, for internal static_assert purpose only.
};
/// Bitmask values for the flags parameter of ICmdBuffer::CmdResolveImage().
@@ -559,12 +540,7 @@ union CmdBufferBuildFlags
/// non-TMZ memory, the results are undefined. Only valid for graphics and compute.
uint32 enableTmz : 1;
/// @internal
/// Build this command buffer in system memory
///
/// @warning This is an internal flag and its existence, its signature and its semantics are not guaranteed
/// across different PAL versions.
uint32 buildInSysMem : 1;
uint32 placeholder3 : 1;
/// If set, internal operations such as blits, copies, etc. will not affect active Query results.
/// Otherwise they may affect the results.
@@ -1309,35 +1285,16 @@ extern const ColorSpaceConversionTable DefaultCscTableYuvToRgb;
/// to perform a RGB to YUV color space conversion. Represents the BT.601 standard (standard-definition TV).
extern const ColorSpaceConversionTable DefaultCscTableRgbToYuv;
/// Specifies flags controlling GPU copy behavior in @ref CmdCopyImage. Format related flags are ignored by DMA queues.
enum CopyImageControlFlags : uint32
{
CopyImageFormatConversion = 0x1, ///< Requests that the copy convert between two compatible formats. This is
/// ignored unless both formats support @ref FormatFeatureFormatConversion.
CopyImageRawSwizzle = 0x2, ///< If possible, raw copies will swizzle from the source channel format into the
/// destination channel format (e.g., RGBA to BGRA).
CopyImageEnableScissorTest = 0x4, ///< If set, do scissor test using the specified scissor rectangle.
CopyImageInitDstMetadata = 0x8, ///< Requests copy initializes dst image's metadata; requires full box copy.
CopyImageControlAllFlags = 0xF ///< Clients should NOT use it, for internal static_assert purpose only.
};
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 955
/// Specifies flags controlling GPU copy behavior. Format related flags are ignored by DMA queues.
enum CopyControlFlags : uint32
{
CopyFormatConversion = CopyImageFormatConversion,
CopyRawSwizzle = CopyImageRawSwizzle,
CopyEnableScissorTest = CopyImageEnableScissorTest,
CopyFormatConversion = 0x1, ///< Requests that the copy convert between two compatible formats. This is ignored
/// unless both formats support @ref FormatFeatureFormatConversion.
CopyRawSwizzle = 0x2, ///< If possible, raw copies will swizzle from the source channel format into the
/// destination channel format (e.g., RGBA to BGRA).
CopyEnableScissorTest = 0x4, ///< If set, do scissor test using the specified scissor rectangle.
CopyControlAllFlags = 0x7 ///< Clients should NOT use it, for internal static_assert purpose only.
};
#endif
/// Specifies flags controlling GPU copy behavior in @ref CmdCopyMemoryToImage.
/// Format related flags are ignored by DMA queues.
enum CopyMemoryToImageControlFlags : uint32
{
CopyMemoryToImageInitDstMetadata = 0x1, ///< Requests copy initializes dst image's metadata; requires full box copy.
CopyMemoryToImageControlAllFlags = 0x1 ///< Clients should NOT use it, for internal static_assert purpose only.
};
/// Specifies parameters for a resolve of one region in an MSAA source image to a region of the same size in a single
/// sample destination image. Used as an input to ICmdBuffer::CmdResolveImage().
@@ -1752,19 +1709,12 @@ struct DispatchAqlParams
};
/// This structure holds the parameters used during kernel dispatch.
struct DispatchAqlFeedback
{
uint32 tmpRingSize; ///< Content of the compute_tmpring_size register.
};
/// @internal Function pointer type definition for issuing AQL dispatches.
///
/// @see ICmdBuffer::CmdDispatchAql().
typedef void (PAL_STDCALL *CmdDispatchAqlFunc)(
ICmdBuffer* pCmdBuffer,
const DispatchAqlParams& dispatchInfo,
DispatchAqlFeedback* pFeedback);
const DispatchAqlParams& dispatchInfo);
/// Specifies input assembler state for draws.
/// @see ICmdBuffer::CmdSetInputAssemblyState
@@ -1978,13 +1928,6 @@ struct Viewport
PointOrigin origin; ///< Origin of the viewport relative to NDC. UpperLeft or LowerLeft.
};
/// Specifies the range for user-defined depth clamp.
struct DepthClamp
{
float minDepth; ///< Minimum depth value after viewport transform.
float maxDepth; ///< Maximum depth value after viewport transform.
};
/// Specifies the viewport transform parameters for setting a single viewport.
/// @see ICmdBuffer::CmdSetViewport
struct ViewportParams
@@ -1998,7 +1941,6 @@ struct ViewportParams
float horzClipRatio; ///< The ratio between guardband clip rect width and viewport width.
float vertClipRatio; ///< The ratio between guardband clip rect height and viewport height.
DepthRange depthRange; ///< Specifies the target range of Z values
DepthClamp userDepthClamp; ///< Specifies the clamp range of Z values for DepthClampMode::UserDefined.
// Define viewports array at the end of the structure as it is common to only access the first N from the CPU.
Viewport viewports[MaxViewports]; ///< Array of desciptors for each viewport.
};
@@ -2147,9 +2089,7 @@ struct CmdBufInfo
uint32 captureCamera : 1; ///< Has Direct Capture camera matrix capture
uint32 hudLessImagePropChanged : 1; ///< Indicates whether HUD less image properties changed
uint32 captureHudLessImage : 1; ///< Has Direct Capture HUD less image capture
uint32 llmDecodeStart : 1; ///< Has LLM decode Start Enabled in the CmdBufInfo packet
uint32 llmDecodeStop : 1; ///< Has LLM decode Stop Enabled in the CmdBufInfo packet
uint32 reserved : 1; ///< Reserved for future usage.
uint32 reserved : 3; ///< Reserved for future usage.
};
uint32 u32All; ///< Flags packed as uint32.
};
@@ -3352,27 +3292,12 @@ public:
/// @param [in] regionCount Number of regions to copy; size of the pRegions array.
/// @param [in] pRegions Array of copy regions, each entry specifying a source offset, a destination
/// subresource, destination x/y/z offset, and copy size in the x/y/z dimensions.
/// @param [in] flags A mask of ORed @ref CopyMemoryToImageControlFlags that can be used to control copy
/// behavior.
virtual void CmdCopyMemoryToImage(
const IGpuMemory& srcGpuMemory,
const IImage& dstImage,
ImageLayout dstImageLayout,
uint32 regionCount,
const MemoryImageCopyRegion* pRegions,
uint32 flags) = 0;
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 955
void CmdCopyMemoryToImage(
const IGpuMemory& srcGpuMemory,
const IImage& dstImage,
ImageLayout dstImageLayout,
uint32 regionCount,
const MemoryImageCopyRegion* pRegions)
{
CmdCopyMemoryToImage(srcGpuMemory, dstImage, dstImageLayout, regionCount, pRegions, 0);
}
#endif
const MemoryImageCopyRegion* pRegions) = 0;
/// Copies data directly (without format conversion) from an image to a GPU memory object.
///
@@ -4895,24 +4820,13 @@ public:
/// NOTE: Available for compute queues when created with aqlQueue set in the QueueCreateInfo.
///
/// @param [in] dispatchInfo Pointer to kernel dispatch info
/// @param [out] pFeedback Pointer to the structure where information about the
/// dispatch can be stored if != nullptr.
///
/// @note This function is to support OpenCL AQL submissions.
void CmdDispatchAql(
const DispatchAqlParams& dispatchInfo,
DispatchAqlFeedback* pFeedback)
{
m_funcTable.pfnCmdDispatchAql(this, dispatchInfo, pFeedback);
}
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 954
inline void CmdDispatchAql(
const DispatchAqlParams& dispatchInfo)
{
CmdDispatchAql(dispatchInfo, nullptr);
m_funcTable.pfnCmdDispatchAql(this, dispatchInfo);
}
#endif
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 888
/// XDMA was retired starting in gfx10 so this function has no use anymore.
@@ -225,8 +225,7 @@ struct BarrierOperations
uint16 initMaskRam : 1; ///< Memsets uninitialized memory to prepare it for use as
/// CMask/FMask/DCC/HTile.
uint16 updateDccStateMetadata : 1; ///< DCC state metadata was updated.
uint16 retileGfxDccToDisplayDcc : 1; ///< Gfx dcc is retiled to display dcc.
uint16 reserved : 6; ///< Reserved for future use.
uint16 reserved : 7; ///< Reserved for future use.
};
uint16 u16All; ///< Unsigned integer containing all the values.
@@ -583,7 +583,6 @@ struct PalPublicSettings
bool forceLoadObjectFailure;
#endif
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 956
/// Controls the distribution mode for tessellation, which affects how patches are processed by different VGT
/// units. 0: None - No distribution across VGTs (legacy mode). 1: Default - Optimal settings are chosen depending
/// on the gfxip. 2: Patch - Individual patches are distributed to different VGTs. 3: Donut - Patches are split
@@ -591,7 +590,6 @@ struct PalPublicSettings
/// distributed to different VGTs. Falls back to donut mode if HW does not support this mode. 5: Trapezoid only -
/// Distribution turned off if HW does not support this mode.
uint32 distributionTessMode;
#endif
/// Flags that control PAL optimizations to reduce context rolls. 0: Optimization disabled. 1: Pad parameter cache
/// space. Sets VS export count and PS interpolant number to per-command buffer maximum value. Reduces context rolls
@@ -689,12 +687,10 @@ struct PalPublicSettings
/// Disables MCBP on demand. This is a temporary setting until ATOMIC_MEM packet issue with MCBP is resolved.
bool disableCommandBufferPreemption;
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 956
/// Disable the fast clear eliminate skipping optimization. This optimization will conservatively track the usage
/// of clear values to allow the vast majority of images that never clear to a value that isn't TC-compatible to
/// skip the CPU and front-end GPU overhead of issuing a predicated fast clear eliminate BLT.
bool disableSkipFceOptimization;
#endif
/// Sets the minimum BPP of surfaces which will have DCC enabled
uint32 dccBitsPerPixelThreshold;
@@ -748,10 +744,8 @@ struct PalPublicSettings
/// 0x12 - Forced Opaque White
uint32 dccInitialClearKind;
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 956
/// Allows the client to not create internal VrsImage. Pal internal will create a 16M image as vrsImageSize.
bool disableInternalVrsImage;
#endif
/// Allows the client to control binning persistent and context states per bin.
/// A value of 0 tells PAL to pick the number of states per bin.
@@ -1401,17 +1395,9 @@ struct DeviceProperties
/// any compute shader on any queue.
uint32 maxAsyncComputeThreadGroupSize;
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION >= 951
DispatchDims maxComputeThreadGroupCount; ///< Maximum number of thread groups supported for compute pipelines
DispatchDims maxTaskMeshThreadGroupCount; ///< Maximum number of thread groups supported for task+mesh pipelines
DispatchDims maxMeshThreadGroupCount; ///< Maximum number of thread groups supported for mesh-only pipelines
uint32 maxTaskPayloadSize; ///< Maximum size in bytes of payload passed from task shader to mesh shader
#else
uint32 maxComputeThreadGroupCountX; ///< Maximum number of thread groups supported
uint32 maxComputeThreadGroupCountY; ///< Maximum number of thread groups supported
uint32 maxComputeThreadGroupCountZ; ///< Maximum number of thread groups supported
#endif
uint32 maxBufferViewStride; ///< Maximum stride, in bytes, that can be specified in a buffer view.
@@ -1654,10 +1640,8 @@ struct DeviceProperties
uint32 tessFactorBufSizePerSe; ///< Size of GPU's the tessellatio-factor buffer, per shader engine.
uint32 tccSizeInBytes; ///< Size of total L2 TCC cache in bytes.
uint32 tcpSizeInBytes; ///< Size of one L1 TCP cache in bytes. There is one TCP per CU.
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 959
uint32 maxLateAllocVsLimit; ///< Maximum number of VS waves that can be in flight without
/// having param cache and position buffer space.
#endif
uint32 shaderPrefetchBytes; ///< Number of bytes the SQ will prefetch, if any.
uint32 gl1cSizePerSa; ///< Size in bytes of GL1 cache per SA.
uint32 instCacheSizePerCu; ///< Size in bytes of instruction cache per CU/WGP.
@@ -1975,7 +1959,6 @@ struct GpuCompatibilityInfo
uint32 sharedMemory : 1; ///< Devices can share memory objects with. IDevice::OpenSharedMemory().
uint32 sharedSync : 1; ///< Devices can share queue semaphores with
/// IDevice::OpenSharedQueueSemaphore().
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 948
uint32 shareThisGpuScreen : 1; ///< Either device can present to this device. Means that the device
/// indicated by the otherDevice param in
/// IDevice::GetMultiGpuCompatibility() can present to the device the
@@ -1983,9 +1966,6 @@ struct GpuCompatibilityInfo
uint32 shareOtherGpuScreen : 1; ///< Either device can present to the other device. Means that the
/// device IDevice::GetMultiGpuCompatibility() was called on can present
/// to the GPU indicated by the otherGpu param.
#else
uint32 reserved1 : 2;
#endif
uint32 peerEncode : 1; ///< whether encoding HW can access FB memory of remote GPU in chain
uint32 peerDecode : 1; ///< whether decoding HW can access FB memory of remote GPU in chain
uint32 peerTransferProtected : 1; ///< whether protected content can be transferred over P2P
@@ -2705,16 +2685,12 @@ struct GetPrimaryInfoOutput
{
struct
{
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 948
/// MGPU flag: this primary surface supports DVO HW compositing mode.
uint32 dvoHwMode : 1;
/// MGPU flag: this primary surface supports XDMA HW compositing mode.
uint32 xdmaHwMode : 1;
/// MGPU flag: this primary surface supports client doing SW compositing mode.
uint32 swMode : 1;
#else
uint32 reserved1 : 3;
#endif
/// MGPU flag: this primary surface supports freesync.
uint32 isFreeSyncEnabled : 1;
/// Single-GPU flag: gives hint to the client that they should use rotated tiling mode.
@@ -2761,7 +2737,6 @@ struct SetClockModeInput
DeviceClockMode clockMode; ///< Used to specify the clock mode for the device.
};
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 948
/// Specifies primary surface MGPU compositing mode.
enum MgpuMode : uint32
{
@@ -2771,9 +2746,7 @@ enum MgpuMode : uint32
MgpuModeXdma = 3, ///< MGPU XDMA HW compositing mode
MgpuModeCount
};
#endif
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 943
/// Specifies input arguments for IDevice::SetMgpuMode(). A client set a particular MGPU compositing mode and whether
/// frame pacing is enabled for a display.
struct SetMgpuModeInput
@@ -2783,9 +2756,7 @@ struct SetMgpuModeInput
bool isFramePacingEnabled; ///< True if frame pacing enabled. If so, the client creates a timer queue
/// to delay the present, and the delay value is calculated by KMD.
};
#endif
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 948
constexpr uint32 XdmaMaxDevices = 8; ///< Maximum number of Devices for XDMA compositing.
/// Specifies XDMA cache buffer info for each gpu.
@@ -2801,7 +2772,6 @@ struct GetXdmaInfoOutput
{
XdmaBufferInfo xdmaBufferInfo[XdmaMaxDevices]; ///< Output XDMA cache buffer info
};
#endif
/// Specifies flipping status flags on a specific VidPnSource. It's Windows specific.
union FlipStatusFlags
@@ -3621,7 +3591,6 @@ public:
virtual Result SetStaticVmidMode(
bool enable) = 0;
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 943
/// Set up MGPU compositing mode of a display provided by client.
///
/// This function should not be called by clients that rely on PAL for compositor management. Basically, if your
@@ -3630,11 +3599,9 @@ public:
/// @param [in] setMgpuModeInput Set MGPU compositing mode input arguments.
///
/// @returns Success if the MGPU compositing mode were successfully set.
inline Result SetMgpuMode(
const SetMgpuModeInput& setMgpuModeInput) const { return Result::Success; }
#endif
virtual Result SetMgpuMode(
const SetMgpuModeInput& setMgpuModeInput) const = 0;
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 948
/// Get XDMA cache buffer information of each GPU based upon video present source ID provided by client.
///
/// This function should not be called by clients that rely on PAL for compositor management. Basically, if your
@@ -3645,11 +3612,10 @@ public:
/// @param [in,out] pGetXdmaInfoOutput Set XDMA cache buffer info output arguments.
///
/// @returns Success if the XDMA cache buffer information were successfully queried.
inline Result GetXdmaInfo(
virtual Result GetXdmaInfo(
uint32 vidPnSrcId,
const IGpuMemory& gpuMemory,
GetXdmaInfoOutput* pGetXdmaInfoOutput) const { return Result::ErrorUnavailable; }
#endif
GetXdmaInfoOutput* pGetXdmaInfoOutput) const = 0;
/// Polls current fullscreen frame metadata controls on given vidPnSourceId, including extended data.
///
@@ -133,12 +133,8 @@ union GpuMemoryCreateFlags
/// indicating the driver must manage both
/// CPU caches and GPU caches that are not flushed on
/// command buffer boundaries.
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 948
uint64 xdmaBuffer : 1; ///< GPU memory will be used for an XDMA cache buffer for
/// transferring data
#else
uint64 reserved1 : 1; ///< Delete this bit when the MAJOR_VERSION backcompat is removed.
#endif
/// between GPUs in a multi-GPU configuration.
uint64 turboSyncSurface : 1; ///< The memory will be used for TurboSync private swapchain primary.
uint64 typedBuffer : 1; ///< GPU memory will be permanently considered a single
@@ -207,9 +203,7 @@ union GpuMemoryCreateFlags
#endif
uint64 directCaptureSource : 1; ///< Memory will be mapped to DirectCapture resource's KMD-managed
/// private VA.
uint64 videoEncoder : 1; ///< Video encoder output butffer stream.
uint64 videoDecoder : 1; ///< Video decoder input butffer stream.
uint64 reserved : 26; ///< Reserved for future use.
uint64 reserved : 28; ///< Reserved for future use.
};
uint64 u64All; ///< Flags packed as 64-bit uint.
};
@@ -96,7 +96,7 @@ enum class MetadataMode : uint16
{
Default = 0, ///< Default behavior. PAL chooses if metadata should be present or not.
ForceEnabled, ///< Optimization Hint: The client would prefer Metadata if possible. Useful for scenarios where
/// metadata isn't an obvious win and clients can enable based on some heuristic or app-detect.
/// metadata isn't an obvious win and clients can enable based on some hueristic or app-detect.
Disabled, ///< The Image will not contain any compression metadata.
FmaskOnly, ///< The color msaa Image will only contain Cmask/Fmask metadata; this mode is only valid for color
/// msaa Image. On GPUs with GFX12-style distributed compression (see supportDistributedCompression
@@ -186,12 +186,8 @@ union ImageCreateFlags
/// "Uninitialized" state at any time. Otherwise, both planes must be
/// transitioned in the same barrier call. Only meaningful if
/// "perSubresInit" is set.
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 957
uint32 repetitiveResolve : 1; ///< Optimization: Is this image resolved multiple times to an image which
/// is mostly similar to this image?
#else
uint32 reservedRepResolve : 1; ///< Reserved for future use.
#endif
uint32 preferSwizzleEqs : 1; ///< Image prefers valid swizzle equations, but an invalid swizzle
/// equation is also acceptable.
uint32 fixedTileSwizzle : 1; ///< Fix this image's tile swizzle to ImageCreateInfo::tileSwizzle. This
@@ -204,14 +200,10 @@ union ImageCreateFlags
uint32 fullResolveDstOnly : 1; ///< Indicates any ICmdBuffer::CmdResolveImage using this image as a
/// desination will overwrite the entire image (width and height of
/// resolve region is same as width and height of resolve dst).
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 960
uint32 fullCopyDstOnly : 1; ///< Indicates any copy to this image will overwrite the entire image.
/// A perf optimization of using post-copy metadata fixup to replace heavy
/// expand at barrier to LayoutCopyDst. Unsafe to enable it if there is
/// potential partial copy to the image.
#else
uint32 reserved956 : 1;
#endif
uint32 pipSwapChain : 1; ///< Indicates this image is PIP swap-chain. It is only supported on
/// Windows platforms.
uint32 view3dAs2dArray : 1; ///< If set client can view 3D image as 2D with its depth as array slices.
@@ -274,8 +266,7 @@ union ImageUsageFlags
///< for this image.
uint32 vrsRateImage : 1; ///< This image is potentially used with CmdBindSampleRateImage
uint32 videoDecoder : 1; ///< Indicating this Image is video decoder target
uint32 videoEncoder : 1; ///< Indicating this Image is video encoder input.
uint32 reserved : 11; ///< Reserved for future use.
uint32 reserved : 12; ///< Reserved for future use.
};
uint32 u32All; ///< Flags packed as 32-bit uint.
};
@@ -824,12 +815,6 @@ public:
/// @returns the reference to ImageCreateInfo
virtual const ImageMemoryLayout& GetMemoryLayout() const = 0;
/// Reports information on the full range of the image's subresources.
///
/// @returns Reports info on the full range of the image's subresources such as number of mips and planes.
virtual SubresRange GetFullSubresourceRange() const = 0;
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 953
/// Reports information on the full range of the image's subresources.
///
/// @param [out] pRange Reports info on the full range of the image's subresources such as number of mips and
@@ -838,17 +823,7 @@ public:
/// @returns Success if the layout was successfully reported. Otherwise, one of the following error codes may be
/// returned:
/// + ErrorInvalidPointer if pRange is null.
Result GetFullSubresourceRange(SubresRange* pRange) const
{
Result result = Result::ErrorInvalidPointer;
if (pRange != nullptr)
{
*pRange = GetFullSubresourceRange();
result = Result::Success;
}
return result;
}
#endif
virtual Result GetFullSubresourceRange(SubresRange* pRange) const = 0;
/// Reports information on the layout of the specified subresource in memory.
///
@@ -984,27 +959,6 @@ public:
const ImageCopyRegion* pImgRegions,
const uint32 regionCount) const = 0;
/// Check if the provided layout transition is compatible (no layout transition blt necessary) or not (requires
/// layout transition blt).
///
/// @param [in] subresRange Image subresource range.
/// @param [in] oldLayout Specifies the current image layout based on bitmasks of allowed operations and
/// engines up to this point. These masks imply the previous compression state. No
/// usage flags should ever be set in oldLayout.usages that correspond to usages
/// that are not supported by the engine that is performing the transition. The engine
/// type performing the transition must be set in oldLayout.engines.
/// @param [in] newLayout Specifies the upcoming image layout based on bitmasks of allowed operations and
/// engines after this point. These masks imply the upcoming compression state.
/// A difference between oldLayoutUsageMask and newLayoutUsageMask may result in layout
/// transition blt (e.g. decompression) and returns compatible = false.
///
/// @returns True if the layout transition is compatible, which indicates no layout transition blt is needed.
/// False otherwise if layout transition is incompatible and requires layout transition blt.
virtual bool IsLayoutTransitionCompatible(
const SubresRange subresRange,
const ImageLayout oldLayout,
const ImageLayout newLayout) const = 0;
protected:
/// @internal Constructor.
///
@@ -43,7 +43,7 @@
/// compatible, it is assumed that the client will default-initialize all structs.
///
/// @ingroup LibInit
#define PAL_INTERFACE_MAJOR_VERSION 960
#define PAL_INTERFACE_MAJOR_VERSION 942
/// Minimum major interface version. This is the minimum interface version PAL supports in order to support backward
/// compatibility. When it is equal to PAL_INTERFACE_MAJOR_VERSION, only the latest interface version is supported.
@@ -112,8 +112,6 @@ enum class NullGpuId : uint32
Navi44, ///< 12.0.0
Navi48, ///< 12.0.1
#if (PAL_CLIENT_INTERFACE_MAJOR_VERSION>= 888)
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 958
#endif
#endif
Max, ///< The maximum count of null devices.
All, ///< If you want to enumerate all null devices.
@@ -343,9 +343,12 @@ struct ThreadTraceInfo
uint32 threadTraceTokenConfig : 1;
uint32 threadTraceStallAllSimds : 1;
uint32 threadTraceExcludeNonDetailShaderData : 1;
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION >= 899
uint32 threadTraceEnableExecPop : 1;
uint32 placeholder3 : 1;
uint32 reserved : 15;
#else
uint32 placeholder2 : 1;
#endif
uint32 reserved : 16;
};
uint32 u32All;
} optionFlags;
@@ -370,7 +373,9 @@ struct ThreadTraceInfo
uint32 threadTraceStallBehavior;
bool threadTraceStallAllSimds;
bool threadTraceExcludeNonDetailShaderData;
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION >= 899
bool threadTraceEnableExecPop;
#endif
} optionValues;
};
@@ -218,15 +218,10 @@ enum class DepthClampMode : uint32
{
Viewport = 0x0, ///< Clamps to the viewport min/max depth bounds
_None = 0x1, ///< Disables depth clamping
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 950
#if PAL_BUILD_SUPPORT_DEPTHCLAMPMODE_ZERO_TO_ONE
ZeroToOne = 0x2, ///< Clamps between 0.0 and 1.0.
UserDefined = 0x3, ///< Clamps based on ViewportParams::userDepthClamp.
#else
UserDefined = 0x2, ///< Clamps based on ViewportParams::userDepthClamp.
#endif
/// @note Do not add entries 0x4 or higher. DynamicGraphicsState::depthClampMode is a 2-bit field.
// Unfortunately for Linux clients, X.h includes a "#define None 0" macro. Clients have their choice of either
// undefing None before including this header or using _None when dealing with PAL.
#ifndef None
@@ -419,20 +414,12 @@ struct GraphicsPipelineCreateInfo
size_t pipelineBinarySize; ///< Size of Pipeline ELF binary in bytes.
const IShaderLibrary** ppShaderLibraries; ///< An array of graphics @ref IShaderLibrary objects. pPipelineBinary
/// and ppShaderLibraries can't be valid at the same time.
/// If the client does not know whether the pipeline is complete,
/// it can add the shader library for a "dummy partial pipeline" to
/// the end of the array to ensure the pipeline is complete.
/// In practice, "complete" means "has a PS on hardware that requires
/// it", although that is an implementation detail that the client
/// does not need to know.
size_t numShaderLibraries; ///< Number of graphics shaderLibrary objects in ppShaderLibraries.
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 959
bool useLateAllocVsLimit; ///< If set, use the specified lateAllocVsLimit instead of PAL internally
/// determining the limit.
uint32 lateAllocVsLimit; ///< The number of VS waves that can be in flight without having param
/// cache and position buffer space. If useLateAllocVsLimit flag is set,
/// PAL will use this limit instead of the PAL-specified limit.
#endif
bool useLateAllocGsLimit; ///< If set, use the specified lateAllocGsLimit instead of PAL internally
/// determining the limit.
uint32 lateAllocGsLimit; ///< Controls GS LateAlloc val (for pos/prim allocations NOT param cache)
@@ -168,7 +168,6 @@ enum class ApplicationProfileClient : uint32
Count
};
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 948
/// Describes a primary surface view
///
/// @see IPlatform::GetPrimaryLayout()
@@ -199,7 +198,6 @@ struct GetPrimaryLayoutOutput
uint32 u32All; ///< Flags packed as 32-bit uint.
} flags; ///< specifies primary surface layout flags.
};
#endif
/// Specifies TurboSync control mode
enum class TurboSyncControlMode : uint32
@@ -465,7 +463,6 @@ public:
/// @returns A reference to a PalPlatformSettings structure.
virtual const PalPlatformSettings& PlatformSettings() const = 0;
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < 948
/// Get primary surface layout based upon VidPnSource provided by client.
///
/// This function is used by client to query the layout of the primary surface. The layout describes how primary
@@ -484,10 +481,9 @@ public:
/// + ErrorInvalidValue if pPrimaryLayoutOutput is invalid.
/// + ErrorUnavailable if no implementation on current platform.
/// + ErrorOutOfMemory if there is not enough system memory.
inline Result GetPrimaryLayout(
virtual Result GetPrimaryLayout(
uint32 vidPnSourceId,
GetPrimaryLayoutOutput* pPrimaryLayoutOutput) { return Result::ErrorUnavailable; }
#endif
GetPrimaryLayoutOutput* pPrimaryLayoutOutput) = 0;
/// Calls TurboSyncControl escape to control TurboSync on specific vidPnSourceId.
///
@@ -95,8 +95,6 @@ struct LibraryInfo
PipelineHash internalLibraryHash; ///< 128-bit identifier extracted from this library's ELF binary, composed of
/// the state the compiler decided was appropriate to identify the compiled
/// library. The lower 64 bits are "stable"; the upper 64 bits are "unique".
Util::StringView<char> colorExports; ///< For a Graphics Partial Pipeline pixel shader, an opaque
/// string to pass to the compiler to build the color export shader.
};
/// Reports shader stats. Multiple bits set in the shader stage mask indicates that multiple shaders have been combined
@@ -147,9 +147,7 @@ public:
Pal::Result UnregisterElfBinary(const ElfBinaryInfo& elfBinaryInfo);
// ==== Base Class Overrides =================================================================================== //
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < COMPRESSION_ARG_VERSION
virtual void OnConfigUpdated(DevDriver::StructuredValue* pJsonConfig) override { }
#endif
virtual Pal::uint64 QueryGpuWorkMask() const override { return 0; }
@@ -278,7 +278,11 @@ struct GpaSampleConfig
Pal::uint32 stallAllSimds : 1; ///< Stall all SIMDs for thread trace stall.
Pal::uint32 excludeNonDetailShaderData : 1; ///< Only emit shader tokens from the SIMD that have been
/// selected for detail instruction tracing
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION >= 899
Pal::uint32 enableExecPopTokens : 1; ///< Output exec tokens
#else
Pal::uint32 placeholder2 : 1;
#endif
Pal::uint32 reserved : 25; ///< Reserved for future use.
};
Pal::uint32 u32All; ///< Bit flags packed as uint32.
@@ -186,9 +186,7 @@ public:
bool IsTimingInProgress() const;
// ==== Base Class Overrides =================================================================================== //
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION < COMPRESSION_ARG_VERSION
virtual void OnConfigUpdated(DevDriver::StructuredValue* pJsonConfig) override { }
#endif
virtual void OnConfigUpdated(DevDriver::StructuredValue* pJsonConfig) override { };
virtual Pal::uint64 QueryGpuWorkMask() const override { return 0; }
@@ -97,11 +97,6 @@ public:
/// the trace controller may advance its state.
void RecordRenderOps(Pal::IQueue* pQueue, const RenderOpCounts& renderOpCounts);
// Force a controller update
virtual void OnUpdated() override { OnRenderOpUpdated(0); }
virtual Pal::IQueue* GetTraceQueue() const override { return m_pQueue; }
private:
/// Controls whether the trace proceeds on absolute render op counts or relative
enum class CaptureMode : Pal::uint8
@@ -38,7 +38,6 @@
#include "palHashMap.h"
#include "palMutex.h"
#include "palPipeline.h"
#include "palQueue.h"
#include "palSysMemory.h"
#include "palGpuMemory.h"
#include "palMemTrackerImpl.h"
@@ -57,7 +56,6 @@ class StructuredValue;
namespace GpuUtil
{
class TraceSession;
class ITraceController;
class ITraceSource;
@@ -84,18 +82,17 @@ enum class TraceSessionState : Pal::uint32
Ready = 0, ///< New trace ready to begin
Requested = 1, ///< A trace has been requested and awaiting acceptance
Preparing = 2, ///< Trace has been accepted and is preparing resources before beginning
Beginning = 3, ///< Commands are now being submitted to the GPU to begin tracing
Running = 4, ///< Trace is in progress
Running = 3, ///< Trace is in progress
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION >= 939
Postamble = 5, ///< The detailed frame trace has ended but its data has not yet been written
Postamble = 4, ///< The detailed frame trace has ended but its data has not yet been written
/// into the session. Some trace sources may still collect data during this time.
PostambleWaiting = 6, ///< Waiting for Postamble to complete.
Completed = 7, ///< Trace has fully completed. RDF trace data is ready to be pulled out by CollectTrace().
Count = 8
#else
Waiting = 5, ///< Trace has ended, but data has not been written into the session
PostambleWaiting = 5, ///< Waiting for Postamble to complete.
Completed = 6, ///< Trace has fully completed. RDF trace data is ready to be pulled out by CollectTrace().
Count = 7
#else
Waiting = 4, ///< Trace has ended, but data has not been written into the session
Completed = 5, ///< Trace has fully completed. RDF trace data is ready to be pulled out by CollectTrace().
Count = 6
#endif
};
@@ -118,12 +115,6 @@ struct TraceErrorHeader
constexpr char ErrorChunkTextIdentifier[TextIdentifierSize] = "TraceError";
constexpr Pal::uint32 ErrorTraceChunkVersion = 1;
/// Function type for TraceSession state change callback
typedef void (PAL_STDCALL *TraceStateChangeCallback)(
const TraceSession& pTraceSession,
TraceSessionState newState,
void* pPrivateData);
/**
***********************************************************************************************************************
* @interface ITraceController
@@ -252,25 +243,8 @@ public:
virtual Pal::Result OnEndPostambleGpuWork(
Pal::uint32 gpuIndex,
Pal::ICmdBuffer** ppCmdBuf) = 0;
/// Called by the associated session to force a controller update and drive the session to completion when there
/// is an insufficient number of update events to accomplish that. This is primarily used in single frame/dispatch
/// captures, during which, the controller won't be automatically updated and we have to force it to return the
/// trace session to a clean state.
virtual void OnUpdated() = 0;
/// Returns the queue tracked in the active trace controller
///
/// Returns the queue used for submitting begin and end-trace gpu-work. The queue is tracked by the active
/// controller
///
/// @returns A valid queue pointer used for submitting gpu-work
/// Or a nullptr if no such queue exists
virtual Pal::IQueue* GetTraceQueue() const = 0;
};
#define COMPRESSION_ARG_VERSION 949
/**
***********************************************************************************************************************
* @interface ITraceSource
@@ -284,23 +258,10 @@ public:
class ITraceSource
{
public:
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION >= COMPRESSION_ARG_VERSION
/// Base class constructor
ITraceSource() : m_useCompression(false)
{ }
/// Called by the associated session to update the current trace configuration. Will parse out common config options
/// then pass to OnConfigUpdated to allow derived classes to parse other options.
///
/// @param [in] pJsonConfig Configuration data formatted as json and stored as DevDriver's StructuredValue object
void OnConfigUpdated(DevDriver::StructuredValue* pJsonConfig);
#else
/// Called by the associated session to update the current trace configuration
///
/// @param [in] pJsonConfig Configuration data formatted as json and stored as DevDriver's StructuredValue object
virtual void OnConfigUpdated(DevDriver::StructuredValue* pJsonConfig) = 0;
#endif
/// Returns a bitmask that represents which GPUs are relevant to this trace source
///
@@ -394,17 +355,6 @@ public:
///
/// @returns true if multiple instances of this trace sources can co-exist in one session, false otherwise.
virtual bool AllowMultipleInstances() const { return false; }
#if PAL_CLIENT_INTERFACE_MAJOR_VERSION >= COMPRESSION_ARG_VERSION
protected:
/// Called by OnConfigUpdated to allow derived classes to update the current trace configuration.
/// Default implementation is empty.
///
/// @param [in] pJsonConfig Configuration data formatted as json and stored as DevDriver's StructuredValue object
virtual void OnConfigUpdatedDerived(DevDriver::StructuredValue* pJsonConfig) { }
bool m_useCompression;
#endif
};
/**
@@ -482,12 +432,6 @@ public:
/// + ErrorUnknown if an internal PAL error occurs.
Pal::Result CancelTrace();
/// Cancels an invalid trace in progress.
///
/// Cancels traces that have not been cleanly collected or actively canceled and returns the trace session
/// to a clean state. It forces a controller update, drives the session to completion and discards any trace data.
void CancelInvalidTrace();
/// Cleans up the RDF chunk stream and makes it ready for a new trace again.
///
/// @returns Success if the trace session and rdf streams were successfully cleaned up and returned to the
@@ -685,7 +629,10 @@ public:
/// Sets the TraceSession state based on external operations
///
/// @param [in] sessionState TraceSessionState value to be assigned as the current state
void SetTraceSessionState(TraceSessionState sessionState);
void SetTraceSessionState(TraceSessionState sessionState)
{
m_sessionState = sessionState;
}
/// Returns the current active controller
///
@@ -743,28 +690,6 @@ public:
/// @return true if a cancelation is in progress.
bool IsCancelingTrace() const { return m_cancelingTrace; }
/// Register a function to be called when the Trace Session state changes.
///
/// @param [in] pfnCallback The function to be called
/// @param [in] pPrivateData A pointer to pass to the callback function when called
///
/// @returns Success if the callback was successfully registered
/// AlreadyExists if the given Callback+PrivateData has already been registered
/// ErrorInvalidValue if the given callback is not valid
Pal::Result RegisterTraceStateChangeCallback(
TraceStateChangeCallback pfnCallback,
void* pPrivateData);
/// Unregister a previously registered Trace Session state change callback.
///
/// @param [in] pfnCallback The function which was previously registered as a callback
/// @param [in] pPrivateData The pointer which is associated with the callback to unregister
///
/// @returns Success if the callback was successfully unregistered
/// NotFound if the given pfnCallback+pPrivateData pair was not found
Pal::Result UnregisterTraceStateChangeCallback(
TraceStateChangeCallback pfnCallback,
void* pPrivateData);
private:
typedef Pal::IPlatform TraceAllocator;
@@ -808,22 +733,5 @@ private:
size_t m_configDataSize; // Size of the cached trace config buffer
bool m_cancelingTrace; // Indicates that a cancel signal has been received and trace cancelation
// is in progress.
Util::Mutex m_stateChangeCallbackLock; // RW lock for state change callbacks
// Default capacity for the Trace Session state change callback vector
static constexpr Pal::uint32 TraceStateChangeCallbacksVecDefaultCapacity = 4;
/// The data required to call a state change callback
struct TraceStateChangeCallbackInfo
{
TraceStateChangeCallback pfnCallback;
void* pPrivateData;
};
using TraceStateChangeCallbacksVec = Util::Vector<TraceStateChangeCallbackInfo,
TraceStateChangeCallbacksVecDefaultCapacity,
TraceAllocator>;
TraceStateChangeCallbacksVec m_traceStateChangeCallbacks; // Registered state change callbacks
};
} // GpuUtil
@@ -53,11 +53,11 @@ template<typename Key,
typename AllocFunc,
size_t GroupSize> class HashBase;
/// Pointer hash functor.
/// Default hash functor.
///
/// Just directly returns bits 31-6 of the key's first dword. This is a decent hash if the key is a pointer.
template<typename Key>
struct PointerHashFunc
struct DefaultHashFunc
{
/// Shifts the key to the right and use the resulting bits as a uint hash.
///
@@ -74,7 +74,7 @@ struct PointerHashFunc
void Init(uint32 minNumBits) const
{
PAL_ASSERT((Min(sizeof(Key), sizeof(uint32)) * 8) >= (minNumBits + ShiftNum));
static_assert(std::is_pointer_v<Key>, "Usage of PointerHashFunc for non-pointer types!");
PAL_ALERT_MSG(sizeof(Key) > sizeof(void*), "Usage of DefaultHashFunc for non-pointer types!");
}
};
@@ -147,9 +147,6 @@ struct StringEqualFunc
bool operator()(const Key& key1, const Key& key2) const;
};
template<typename Key>
using DefaultHashFunc = std::conditional_t<std::is_pointer_v<Key>, PointerHashFunc<Key>, JenkinsHashFunc<Key>>;
/**
***********************************************************************************************************************
* @brief Fixed-size, growable, and lazy-free memory pool allocator.
@@ -372,20 +369,6 @@ public:
/// Empty the hash container.
void Reset();
/// Removes an entry that matches the specified key.
///
/// @param [in] key Key of the entry to erase.
///
/// @returns True if the erase completed successfully, false if an entry for this key did not exist.
bool Erase(const Key& key);
/// Returns true if the specified key exists in the set.
///
/// @param [in] key Key to search for.
///
/// @returns True if the specified key exists in the set.
bool Contains(const Key& key) const;
protected:
/// @internal Constructor
///
@@ -393,7 +376,7 @@ protected:
/// take (buckets * GroupSize) bytes.
/// @param [in] pAllocator The allocator that will allocate memory if required.
explicit HashBase(uint32 numBuckets, Allocator*const pAllocator);
~HashBase() { PAL_SAFE_FREE(m_pMemory, &m_allocator); }
virtual ~HashBase() { PAL_SAFE_FREE(m_pMemory, &m_allocator); }
/// @internal Ensures that the hash table has been allocated, then finds the bucket that matches
/// the specified key
@@ -412,24 +395,6 @@ protected:
/// @returns Pointer to the bucket corresponding to the specified key.
Entry* FindBucket(const Key& key) const;
/// @internal Finds a given entry.
///
/// @param [in] key Key to find matching bucket for.
///
/// @returns Pointer to the entry corresponding to the specified key or nullptr.
Entry* FindEntry(const Key& key) const;
/// @internal Finds a given entry; if no entry was found, allocate it.
///
/// @param [in] key Key to search for.
/// @param [out] pExisted True if an entry for the specified key existed before this call was made. False indicates
/// that a new entry was allocated as a result of this call.
/// @param [out] ppValue Readable/writeable value in the hash map corresponding to the specified key.
///
/// @returns @ref Success if the operation completed successfully, or @ref ErrorOutOfMemory if the operation failed
/// because an internal memory allocation failed.
Result FindAllocateEntry(const Key& key, bool* pExisted, Entry** ppValue);
/// @internal Returns pointer to the next group of the specified group.
///
/// @param [in] pGroup Current group to find next group for.
@@ -37,9 +37,9 @@ namespace Util
{
// =====================================================================================================================
// Hash function for pointers. Simply shift the key to the right and use the resulting bits as the hash.
// Default hash function implementation. Simply shift the key to the right and use the resulting bits as the hash.
template<typename Key>
uint32 PointerHashFunc<Key>::operator()(
uint32 DefaultHashFunc<Key>::operator()(
const void* pVoidKey,
uint32 keyLen
) const
@@ -460,84 +460,6 @@ void HashBase<Key, Entry, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>:
m_allocator.Reset();
}
// =====================================================================================================================
// Removes an entry with the specified key.
template<typename Key,
typename Entry,
typename Allocator,
typename HashFunc,
typename EqualFunc,
typename AllocFunc,
size_t GroupSize>
bool HashBase<Key, Entry, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>::Erase(
const Key& key)
{
// Get the bucket base address.
Entry* pGroup = this->FindBucket(key);
Entry* pFoundEntry = nullptr;
Entry* pLastEntry = nullptr;
Entry* pLastEntryGroup = nullptr;
// Find the entry to delete
while (pGroup != nullptr)
{
const uint32 numEntries = this->GetGroupFooterNumEntries(pGroup);
// Search each group
uint32 i = 0;
for (; i < numEntries; i++)
{
if (this->m_equalFunc(pGroup[i].key, key) == true)
{
// We shouldn't find the same key twice.
PAL_ASSERT(pFoundEntry == nullptr);
pFoundEntry = &(pGroup[i]);
}
// keep track of last entry of all groups in bucket
pLastEntry = &(pGroup[i]);
pLastEntryGroup = pGroup;
}
// Chain to the next entry group.
pGroup = this->GetNextGroup(pGroup);
}
// Copy the last entry's data into the entry that we are removing and invalidate the last entry as it now appears
// earlier in the list. This also handles the case where the entry to be removed is the last entry.
if (pFoundEntry != nullptr)
{
PAL_ASSERT(pLastEntry != nullptr);
*pFoundEntry = std::move(*pLastEntry);
memset(pLastEntry, 0, sizeof(Entry));
PAL_ASSERT(this->m_numEntries > 0);
this->m_numEntries--;
const uint32 numEntries = this->GetGroupFooterNumEntries(pLastEntryGroup);
this->SetGroupFooterNumEntries(pLastEntryGroup, numEntries - 1);
}
return (pFoundEntry != nullptr);
}
// =====================================================================================================================
// Check if the given hashtable contains the given key.
template<typename Key,
typename Entry,
typename Allocator,
typename HashFunc,
typename EqualFunc,
typename AllocFunc,
size_t GroupSize>
bool HashBase<Key, Entry, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>::Contains(
const Key& key) const
{
return FindEntry(key) != nullptr;
}
// =====================================================================================================================
// Ensures that the hash table has been allocated, then returns pointer to start group of the bucket
// corresponding to the specified key. A return of nullptr means out of memory.
@@ -578,122 +500,6 @@ Entry* HashBase<Key, Entry, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize
return (m_pMemory != nullptr) ? static_cast<Entry*>(VoidPtrInc(m_pMemory, bucket * GroupSize)) : nullptr;
}
// =====================================================================================================================
// Gets a pointer to the entry that matches the key. Returns null if no entry is present matching the specified key.
template<typename Key,
typename Entry,
typename Allocator,
typename HashFunc,
typename EqualFunc,
typename AllocFunc,
size_t GroupSize>
Entry* HashBase<Key, Entry, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>::FindEntry(
const Key& key
) const
{
// Get the bucket base address.
Entry* pGroup = this->FindBucket(key);
Entry* pMatchingEntry = nullptr;
while (pGroup != nullptr)
{
const uint32 numEntries = this->GetGroupFooterNumEntries(pGroup);
// Search this entry group
uint32 i = 0;
for (; i < numEntries; i++)
{
if (this->m_equalFunc(pGroup[i].key, key))
{
// We've found the entry.
pMatchingEntry = &(pGroup[i]);
break;
}
}
if ((pMatchingEntry != nullptr) || (i < EntriesInGroup))
{
break;
}
// Chain to the next entry group.
pGroup = this->GetNextGroup(pGroup);
}
return pMatchingEntry;
}
// =====================================================================================================================
// Gets a pointer to the entry that matches the key. If the key is not present, a pointer to empty space for the value
// is returned.
template<typename Key,
typename Entry,
typename Allocator,
typename HashFunc,
typename EqualFunc,
typename AllocFunc,
size_t GroupSize>
Result HashBase<Key, Entry, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>::FindAllocateEntry(
const Key& key, // Key to search for.
bool* pExisted, // [out] True if a matching key was found.
Entry** ppEntry) // [out] Pointer to the value entry of the hash map's entry for the specified key.
{
PAL_ASSERT(pExisted != nullptr);
PAL_ASSERT(ppEntry != nullptr);
Result result = Result::ErrorOutOfMemory;
// Get the bucket base address....
Entry* pGroup = this->InitAndFindBucket(key);
*pExisted = false;
*ppEntry = nullptr;
Entry* pMatchingEntry = nullptr;
while (pGroup != nullptr)
{
const uint32 numEntries = this->GetGroupFooterNumEntries(pGroup);
// Search this entry group.
uint32 i = 0;
for (; i < numEntries; i++)
{
if (this->m_equalFunc(pGroup[i].key, key))
{
// We've found the entry.
pMatchingEntry = &(pGroup[i]);
*pExisted = true;
break;
}
}
// We've reached the end of the allocated buckets and the entry was not found.
// Allocate this entry for the key.
if ((pMatchingEntry == nullptr) && (i < EntriesInGroup))
{
pGroup[i].key = key;
pMatchingEntry = &(pGroup[i]);
this->m_numEntries++;
this->SetGroupFooterNumEntries(pGroup, numEntries + 1);
}
if (pMatchingEntry != nullptr)
{
*ppEntry= pMatchingEntry;
result = Result::Success;
break;
}
// Chain to the next entry group.
pGroup = this->AllocateNextGroup(pGroup);
}
PAL_ASSERT(result == Result::Success);
return result;
}
// =====================================================================================================================
// Returns pointer to the next group of the specified group.
template<
@@ -57,8 +57,7 @@ struct HashMapEntry
*
* HashFunc is a functor for hashing keys. Built-in choices for HashFunc are:
*
* - DefaultHashFunc: Default hash function, selects best hash function based on type of key.
* - PointerHashFunc: Good choice when the key is a pointer.
* - DefaultHashFunc: Good choice when the key is a pointer.
* - JenkinsHashFunc: Good choice when the key is arbitrary binary data.
* - StringJenkinsHashFunc: Good choice when the key is a C-style string.
*
@@ -93,7 +92,7 @@ public:
/// take (buckets * GroupSize) bytes.
/// @param [in] pAllocator Pointer to an allocator that will create system memory requested by this hash container.
explicit HashMap(uint32 numBuckets, Allocator*const pAllocator): Base::HashBase(numBuckets, pAllocator) { }
~HashMap() { }
virtual ~HashMap() { }
/// Finds a given entry; if no entry was found, allocate it.
///
@@ -125,6 +124,13 @@ public:
/// because an internal memory allocation failed.
Result Insert(const Key& key, const Value& value);
/// Removes an entry that matches the specified key.
///
/// @param [in] key Key of the entry to erase.
///
/// @returns True if the erase completed successfully, false if an entry for this key did not exist.
bool Erase(const Key& key);
private:
// Typedef for the specialized 'HashBase' object we're inheriting from so we can use properly qualified names when
// accessing members of HashBase.
@@ -55,12 +55,55 @@ Result HashMap<Key, Value, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>
PAL_ASSERT(pExisted != nullptr);
PAL_ASSERT(ppValue != nullptr);
Entry* pEntry = nullptr;
Result result = Base::FindAllocateEntry(key, pExisted, &pEntry);
if (result == Result::Success)
Result result = Result::ErrorOutOfMemory;
// Get the bucket base address....
Entry* pGroup = this->InitAndFindBucket(key);
*pExisted = false;
*ppValue = nullptr;
Entry* pMatchingEntry = nullptr;
while (pGroup != nullptr)
{
*ppValue = &pEntry->value;
const uint32 numEntries = this->GetGroupFooterNumEntries(pGroup);
// Search this entry group.
uint32 i = 0;
for (; i < numEntries; i++)
{
if (this->m_equalFunc(pGroup[i].key, key))
{
// We've found the entry.
pMatchingEntry = &(pGroup[i]);
*pExisted = true;
break;
}
}
// We've reached the end of the allocated buckets and the entry was not found.
// Allocate this entry for the key.
if ((pMatchingEntry == nullptr) && (i < Base::EntriesInGroup))
{
pGroup[i].key = key;
pMatchingEntry = &(pGroup[i]);
this->m_numEntries++;
this->SetGroupFooterNumEntries(pGroup, numEntries + 1);
}
if (pMatchingEntry != nullptr)
{
*ppValue = &(pMatchingEntry->value);
result = Result::Success;
break;
}
// Chain to the next entry group.
pGroup = this->AllocateNextGroup(pGroup);
}
PAL_ASSERT(result == Result::Success);
return result;
}
@@ -78,8 +121,36 @@ Value* HashMap<Key, Value, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>
const Key& key
) const
{
Entry* pEntry = Base::FindEntry(key);
return (pEntry != nullptr) ? &pEntry->value : nullptr;
// Get the bucket base address.
Entry* pGroup = this->FindBucket(key);
Entry* pMatchingEntry = nullptr;
while (pGroup != nullptr)
{
const uint32 numEntries = this->GetGroupFooterNumEntries(pGroup);
// Search this entry group
uint32 i = 0;
for (; i < numEntries; i++)
{
if (this->m_equalFunc(pGroup[i].key, key))
{
// We've found the entry.
pMatchingEntry = &(pGroup[i]);
break;
}
}
if ((pMatchingEntry != nullptr) || (i < Base::EntriesInGroup))
{
break;
}
// Chain to the next entry group.
pGroup = this->GetNextGroup(pGroup);
}
return (pMatchingEntry != nullptr) ? &(pMatchingEntry->value) : nullptr;
}
// =====================================================================================================================
@@ -96,14 +167,14 @@ Result HashMap<Key, Value, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>
const Value& value)
{
bool existed = true;
Entry* pEntry = nullptr;
Value* pValue = nullptr;
Result result = Base::FindAllocateEntry(key, &existed, &pEntry);
Result result = FindAllocate(key, &existed, &pValue);
// Add the new value if it did not exist already. If FindAllocate returns Success, pValue != nullptr.
if ((result == Result::Success) && (existed == false))
{
pEntry->value = value;
*pValue = value;
}
PAL_ASSERT(result == Result::Success);
@@ -111,4 +182,69 @@ Result HashMap<Key, Value, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>
return result;
}
// =====================================================================================================================
// Removes an entry with the specified key.
template<typename Key,
typename Value,
typename Allocator,
template<typename> class HashFunc,
template<typename> class EqualFunc,
typename AllocFunc,
size_t GroupSize>
bool HashMap<Key, Value, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>::Erase(
const Key& key)
{
// Get the bucket base address.
Entry* pGroup = this->FindBucket(key);
Entry* pFoundEntry = nullptr;
Entry* pLastEntry = nullptr;
Entry* pLastEntryGroup = nullptr;
// Find the entry to delete
while (pGroup != nullptr)
{
const uint32 numEntries = this->GetGroupFooterNumEntries(pGroup);
// Search each group
uint32 i = 0;
for (; i < numEntries; i++)
{
if (this->m_equalFunc(pGroup[i].key, key) == true)
{
// We shouldn't find the same key twice.
PAL_ASSERT(pFoundEntry == nullptr);
pFoundEntry = &(pGroup[i]);
}
// keep track of last entry of all groups in bucket
pLastEntry = &(pGroup[i]);
pLastEntryGroup = pGroup;
}
// Chain to the next entry group.
pGroup = this->GetNextGroup(pGroup);
}
// Copy the last entry's data into the entry that we are removing and invalidate the last entry as it now appears
// earlier in the list. This also handles the case where the entry to be removed is the last entry.
if (pFoundEntry != nullptr)
{
PAL_ASSERT(pLastEntry != nullptr);
pFoundEntry->key = pLastEntry->key;
pFoundEntry->value = pLastEntry->value;
memset(pLastEntry, 0, sizeof(Entry));
PAL_ASSERT(this->m_numEntries > 0);
this->m_numEntries--;
const uint32 numEntries = this->GetGroupFooterNumEntries(pLastEntryGroup);
this->SetGroupFooterNumEntries(pLastEntryGroup, numEntries - 1);
}
return (pFoundEntry != nullptr);
}
} // Util
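The `Erase` comments above describe a swap-with-last removal: the last entry in the bucket is copied over the entry being removed, and the last slot is then invalidated. A minimal sketch of that idea follows, assuming a flat `std::vector` in place of PAL's chained entry groups. `SketchEntry` and `SketchErase` are hypothetical names used only for illustration and are not part of PAL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one hash bucket: a flat list of key/value entries.
// This is not PAL's Entry/group layout, only an illustration of the erase strategy.
struct SketchEntry
{
    int key;
    int value;
};

// Removes the entry matching `key` by overwriting it with the last entry in the
// bucket and then dropping the last slot. Returns true if an entry was removed.
bool SketchErase(std::vector<SketchEntry>& bucket, int key)
{
    for (std::size_t i = 0; i < bucket.size(); ++i)
    {
        if (bucket[i].key == key)
        {
            assert(bucket.empty() == false);
            bucket[i] = bucket.back();  // Harmless self-copy when the match is already last.
            bucket.pop_back();          // Invalidate the old last slot.
            return true;
        }
    }
    return false;
}
```

The trade-off is that removal never shifts the remaining entries, at the cost of not preserving their order within the bucket.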
@@ -56,8 +56,7 @@ struct HashSetEntry
*
* HashFunc is a functor for hashing keys. Built-in choices for HashFunc are:
*
* - DefaultHashFunc: Default hash function, selects best hash function based on type of key.
* - PointerHashFunc: Good choice when the key is a pointer.
* - DefaultHashFunc: Good choice when the key is a pointer.
* - JenkinsHashFunc: Good choice when the key is arbitrary binary data.
* - StringJenkinsHashFunc: Good choice when the key is a C-style string.
*
@@ -97,7 +96,7 @@ public:
/// take (buckets * GroupSize) bytes.
/// @param [in] pAllocator Pointer to an allocator that will create system memory requested by this hash container.
explicit HashSet(uint32 numBuckets, Allocator*const pAllocator) : Base::HashBase(numBuckets, pAllocator) {}
~HashSet() { }
virtual ~HashSet() { }
/// Finds a given entry; if no entry was found, allocate it.
///
@@ -109,6 +108,13 @@ public:
/// @ref ErrorOutOfMemory if the operation failed because an internal memory allocation failed.
Result FindAllocate(Key** ppKey, bool* pExisted);
/// Returns true if the specified key exists in the set.
///
/// @param [in] key Key to search for.
///
/// @returns True if the specified key exists in the set.
bool Contains(const Key& key) const;
/// Inserts an entry.
///
/// No action will be taken if an entry matching this key already exists in the set.
@@ -119,6 +125,13 @@ public:
/// because an internal memory allocation failed.
Result Insert(const Key& key);
/// Removes an entry that matches the specified key.
///
/// @param [in] key Key of the entry to erase.
///
/// @returns True if the erase completed successfully, false if an entry for this key did not exist.
bool Erase(const Key& key);
private:
// Typedef for the specialized 'HashBase' object we're inheriting from so we can use properly qualified names when
// accessing members of HashBase.
@@ -48,9 +48,14 @@ template<typename Key,
Result HashSet<Key, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>::Insert(
const Key& key)
{
Entry* pEntry = nullptr;
bool existed = false;
return Base::FindAllocateEntry(key, &existed, &pEntry);
Key* pKey = const_cast<Key*>(&key);
bool existed;
const Result result = FindAllocate(&pKey, &existed);
if (existed == false)
{
*pKey = key;
}
return result;
}
// =====================================================================================================================
@@ -68,8 +73,159 @@ Result HashSet<Key, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>::FindA
PAL_ASSERT(ppKey != nullptr);
PAL_ASSERT(pExisted != nullptr);
static_assert(offsetof(Entry, key) == 0);
return Base::FindAllocateEntry(**ppKey, pExisted, reinterpret_cast<Entry**>(ppKey));
Result result = Result::ErrorOutOfMemory;
// Get the bucket base address.
Entry* pGroup = this->InitAndFindBucket(**ppKey);
Entry* pMatchingEntry = nullptr;
while (pGroup != nullptr)
{
const uint32 numEntries = this->GetGroupFooterNumEntries(pGroup);
// Search this entry group.
uint32 i = 0;
for (; i < numEntries; i++)
{
if (this->m_equalFunc(pGroup[i].key, **ppKey))
{
// We've found the entry.
pMatchingEntry = &(pGroup[i]);
*pExisted = true;
break;
}
}
if ((pMatchingEntry == nullptr) && (i < Base::EntriesInGroup))
{
// We've reached the end of the bucket and the entry was not found. Allocate this entry for the key.
*pExisted = false;
*ppKey = &pGroup[i].key;
pMatchingEntry = &(pGroup[i]);
this->m_numEntries++;
this->SetGroupFooterNumEntries(pGroup, numEntries + 1);
}
if (pMatchingEntry != nullptr)
{
result = Result::Success;
break;
}
// Chain to the next entry group.
pGroup = this->AllocateNextGroup(pGroup);
}
PAL_ASSERT(result == Result::Success);
return result;
}
// =====================================================================================================================
// Searches for the specified key to see if it exists.
template<typename Key,
typename Allocator,
template<typename> class HashFunc,
template<typename> class EqualFunc,
typename AllocFunc,
size_t GroupSize>
bool HashSet<Key, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>::Contains(
const Key& key
) const
{
// Get the bucket base address.
Entry* pGroup = this->FindBucket(key);
Entry* pMatchingEntry = nullptr;
while (pGroup != nullptr)
{
const uint32 numEntries = this->GetGroupFooterNumEntries(pGroup);
// Search this entry group.
uint32 i = 0;
for (; i < numEntries; i++)
{
if (this->m_equalFunc(pGroup[i].key, key))
{
// We've found the entry.
pMatchingEntry = &(pGroup[i]);
break;
}
}
if ((pMatchingEntry != nullptr) || (i < Base::EntriesInGroup))
{
break;
}
// Chain to the next entry group.
pGroup = this->GetNextGroup(pGroup);
}
return (pMatchingEntry != nullptr);
}
// =====================================================================================================================
// Removes an entry with the specified key.
template<typename Key,
typename Allocator,
template<typename> class HashFunc,
template<typename> class EqualFunc,
typename AllocFunc,
size_t GroupSize>
bool HashSet<Key, Allocator, HashFunc, EqualFunc, AllocFunc, GroupSize>::Erase(
const Key& key)
{
// Get the bucket base address.
Entry* pGroup = this->FindBucket(key);
Entry* pFoundEntry = nullptr;
Entry* pLastEntry = nullptr;
Entry* pLastEntryGroup = nullptr;
// Find the entry to delete.
while ((pGroup != nullptr))
{
const uint32 numEntries = this->GetGroupFooterNumEntries(pGroup);
// Search this entry group.
uint32 i = 0;
for (; i < numEntries; i++)
{
if (this->m_equalFunc(pGroup[i].key, key) == true)
{
// We shouldn't find the same key twice.
PAL_ASSERT(pFoundEntry == nullptr);
pFoundEntry = &(pGroup[i]);
}
// Keep track of the last entry across all groups in the bucket.
pLastEntry = &(pGroup[i]);
pLastEntryGroup = pGroup;
}
// Chain to the next entry group
pGroup = this->GetNextGroup(pGroup);
}
// Copy the last entry's data into the entry that we are removing and invalidate the last entry as it now appears
// earlier in the list. This also handles the case where the entry to be removed is the last entry.
if (pFoundEntry != nullptr)
{
PAL_ASSERT(pLastEntry != nullptr);
pFoundEntry->key = pLastEntry->key;
memset(pLastEntry, 0, sizeof(Entry));
PAL_ASSERT(this->m_numEntries > 0);
this->m_numEntries--;
const uint32 numEntries = this->GetGroupFooterNumEntries(pLastEntryGroup);
this->SetGroupFooterNumEntries(pLastEntryGroup, numEntries - 1);
}
return (pFoundEntry != nullptr);
}
} // Util
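In the `HashSet` code above, `Insert` is layered on `FindAllocate`: the lookup either finds the existing key or reserves a new slot, and `Insert` writes the key only when the slot is new. A rough sketch of that contract follows, assuming a single flat `std::vector<int>` bucket instead of PAL's entry groups and allocator. `SketchFindAllocate` and `SketchInsert` are hypothetical names, not PAL functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical find-or-allocate over a single bucket. Returns the slot index for
// `key`; when the key is absent, a value-initialized slot is appended and
// *pExisted is set to false so the caller knows it must fill the slot.
std::size_t SketchFindAllocate(std::vector<int>& bucket, int key, bool* pExisted)
{
    assert(pExisted != nullptr);
    for (std::size_t i = 0; i < bucket.size(); ++i)
    {
        if (bucket[i] == key)
        {
            *pExisted = true;
            return i;
        }
    }
    *pExisted = false;
    bucket.emplace_back();  // Reserve the slot; the key is written by the caller.
    return bucket.size() - 1;
}

// Hypothetical insert built on the helper above: takes no action on the stored
// data if the key is already present.
void SketchInsert(std::vector<int>& bucket, int key)
{
    bool existed = false;
    const std::size_t slot = SketchFindAllocate(bucket, key, &existed);
    if (existed == false)
    {
        bucket[slot] = key;
    }
}
```

Unlike this sketch, the real `FindAllocate` can fail with `ErrorOutOfMemory` when a new entry group cannot be allocated, so callers there also need to check the returned `Result`.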
@@ -99,7 +99,6 @@ static_assert(false, "Clients may not define macros named \"min\" or \"max\".");
// Equates to [__declspec(align(__x))](https://github.com/MicrosoftDocs/cpp-docs/blob/master/docs/cpp/align-cpp.md) on Windows.
#define PAL_ALIGN(__x) __declspec(align(__x))
#define PAL_FORCE_INLINE __forceinline
#define PAL_NO_INLINE __declspec(noinline)
#else
/// Undefined on GCC platforms.
#define PAL_STDCALL
@@ -108,7 +107,6 @@ static_assert(false, "Clients may not define macros named \"min\" or \"max\".");
/// Undefined on GCC platforms.
#define PAL_ALIGN(__x)
#define PAL_FORCE_INLINE __attribute__((always_inline)) inline
#define PAL_NO_INLINE __attribute__((noinline))
#endif
/// Platform cache line size in bytes.
@@ -601,16 +599,6 @@ constexpr bool IsErrorResult(Result result) { return (static_cast<int32>(result)
constexpr Result CollapseResults(Result lhs, Result rhs)
{ return (IsErrorResult(lhs) || (static_cast<uint32>(lhs) > static_cast<uint32>(rhs))) ? lhs : rhs; }
/// A simple enum-to-string helper function. Given a result like Result::ErrorOutOfMemory, it returns a pointer to a
/// global string containing "ErrorOutOfMemory". The caller must not try to free the returned string.
///
/// @param [in] result The Result code to turn into a string.
///
/// @returns A valid pointer to the appropriate global string or to "FixTheTables!!!" if someone forgot to update the
/// internal string tables when they added a new Result value. It's impossible for this to return nullptr.
extern const char* ResultToString(
Result result);
/**
***********************************************************************************************************************
* @page UtilOverview Utility Collection
@@ -1,5 +1,5 @@
outs:
- md5: 95d96350d29e5d7ee8249b13f8344bfa
- md5: fd5f7481a122f40f73d1f638e3b9b027
size: 16738
hash: md5
path: DriverUtilsService.lib
@@ -1,5 +1,5 @@
outs:
- md5: 456e1346e62388d873836eee241c2ecc
- md5: e09dbb1896128ac2b2bcac2b35878a40
size: 9460
hash: md5
path: SettingsRpcService2.lib
@@ -1,5 +1,5 @@
outs:
- md5: 26144175ebf644e9c406a84cac291898
- md5: 364bc94b5b81ef5bb337e6afb0060c55
size: 13912
hash: md5
path: UberTraceService.lib
@@ -1,5 +1,5 @@
outs:
- md5: 82c883995b5833b7c1e3456da645f1c7
size: 976846
- md5: 56362998d9feb9b0ce6ccad8441bf1c8
size: 820446
hash: md5
path: addrlib.lib
@@ -1,5 +1,5 @@
outs:
- md5: b752c646510e1b854e86d2180ce91cbb
- md5: 35af646710d883bfe6184113cb88e96a
size: 702568
hash: md5
path: amdrdf.lib
@@ -1,5 +1,5 @@
outs:
- md5: 25f322e041c71e95504e49333ba711c4
- md5: 06c7697ce380a8127e7478041aed7fc8
size: 27894
hash: md5
path: cwpack.lib
@@ -1,5 +1,5 @@
outs:
- md5: 8638cb376e4098e11bbd3a96d9de126d
- md5: 7c6ba83c44ee8bd70397a1458dbea7e0
size: 82210
hash: md5
path: ddCommon.lib
@@ -1,5 +1,5 @@
outs:
- md5: 429ad9f4c0eb7a231c97fff83ed3aac9
- md5: 32f4aa9943ab5fde0da6f09bcbacf9be
size: 72778
hash: md5
path: ddCore.lib
@@ -1,5 +1,5 @@
outs:
- md5: 1c256136b2c4b0ac352156910e159df9
size: 138010
- md5: 09fc5ce2eb8653cbd8ddda5d983ce836
size: 137794
hash: md5
path: ddEventClient.lib
@@ -1,5 +1,5 @@
outs:
- md5: 1370a8b4f3697241d7dac7d43aa9dd79
- md5: 2f5e7eb06485bebea0ec2779d8df9f97
size: 46862
hash: md5
path: ddEventParser.lib
@@ -1,5 +1,5 @@
outs:
- md5: b8622eb38d6c1468301489f98fef8e4d
- md5: bdb3738c5bbd4ac6abd3a805930b89fe
size: 30582
hash: md5
path: ddEventServer.lib
@@ -1,5 +1,5 @@
outs:
- md5: 9bfec5d057726ec5d03213ede38ae4a2
- md5: 43ea2bd7328593fab054754f9fd1a7c9
size: 35768
hash: md5
path: ddEventStreamer.lib
@@ -1,5 +1,5 @@
outs:
- md5: ce4b653d66b86c8f6f8e11bbfffea54f
- md5: d91f19ed479fd51b481f6b25566dfd31
size: 13230
hash: md5
path: ddNet.lib
@@ -1,5 +1,5 @@
outs:
- md5: 20bba474c25d34bad7c675c0d774017c
- md5: 5b37ae9cc29dfaba3cb0d08a30bd684a
size: 23224
hash: md5
path: ddRpcClient.lib
@@ -1,5 +1,5 @@
outs:
- md5: a0f3d781dc7358b33f693ad6f19284ab
- md5: b3f63ef6d9a9d6bfb3e5934a9a34465e
size: 179024
hash: md5
path: ddRpcServer.lib
@@ -1,5 +1,5 @@
outs:
- md5: 78c64bc9c07804300f848425908f3443
- md5: db87375bafb0d667ac054dbf7dc0dc36
size: 16268
hash: md5
path: ddRpcShared.lib
@@ -1,5 +1,5 @@
outs:
- md5: 4f16385611ba26f28374e11d9421269d
- md5: f88897c7d989d95f0352cf6e1a21df99
size: 106228
hash: md5
path: ddSocket.lib
@@ -1,5 +1,5 @@
outs:
- md5: c4bed6f36417e25fdb069616b2ef7edd
- md5: 4d64cf4c3b034f09a8a3abd1b7e657b4
size: 35902
hash: md5
path: ddYaml.lib
@@ -1,5 +1,5 @@
outs:
- md5: cb9ece645ccf22601a53d532614422c8
size: 663402
- md5: 2394b7141b71f0b738dd3ad024dcbfc0
size: 661222
hash: md5
path: dd_common.lib
@@ -1,5 +1,5 @@
outs:
- md5: 5d4c03c414cd2661e27a9d661b2aaaa0
- md5: 2b9f0af04b216527b49338cc1b8fa1a5
size: 264022
hash: md5
path: dd_libyaml.lib
@@ -1,5 +1,5 @@
outs:
- md5: 5de26d6d7739cdad888688eb4937fcb1
size: 213370
- md5: d9a1105679db9411bf2365aae2b6d2a5
size: 212936
hash: md5
path: dd_settings.lib
@@ -1,5 +1,5 @@
outs:
- md5: 25a99e1ae065b735a3644ece1fc9c0c4
size: 2701484
- md5: 18e7d04c4ecc9fb872de2e0ac9dffd61
size: 2700190
hash: md5
path: devdriver.lib
@@ -1,5 +1,5 @@
outs:
- md5: 53cad99c6cd2848fb4e283db250e1148
- md5: 64dbdb1c2d7c68e7ae3083ea35878a83
size: 28682
hash: md5
path: metrohash.lib
@@ -1,5 +1,5 @@
outs:
- md5: 8c5d464f8f60a4285770e7994a74ba70
- md5: 43f91cf1e53eef1411a6e4a40776cd79
size: 218874
hash: md5
path: mpack.lib
@@ -1,5 +1,5 @@
outs:
- md5: 5800da924b60abf1b7ca111ac2fb1aae
size: 20625154
- md5: 852e161ac4115309a2591db0b80f13dd
size: 24025742
hash: md5
path: pal.lib
@@ -1,5 +1,5 @@
outs:
- md5: 59a339d6330fa360eb7452a15851d2d4
size: 440980
- md5: a929ad3103021925d382e419b0e5343d
size: 433780
hash: md5
path: palCompilerDeps.lib
@@ -1,5 +1,5 @@
outs:
- md5: b35b947076fbb8821eab7252511049e2
size: 831218
- md5: 3cca5923fa12cf564360058254c2c6db
size: 799750
hash: md5
path: palUtil.lib
@@ -1,5 +1,5 @@
outs:
- md5: d5ebea86c9821bd43006bea5e8fd1ce3
- md5: 36731971681f4a89f4e89b5ad44473ac
size: 291664
hash: md5
path: pal_lz4.lib
@@ -1,5 +1,5 @@
outs:
- md5: 8551962e3709c5da05736df2959b074f
- md5: 1ad5de7ebbb13b41f7d7dc0367d7d1d8
size: 3460
hash: md5
path: pal_uuid.lib
@@ -1,5 +1,5 @@
outs:
- md5: cb6d2ce450c3869437fc090fa06eb1c5
- md5: b2a1cd0f59d07aaa0cf21afa9235dbda
size: 25990
hash: md5
path: stb_sprintf.lib
@@ -1,5 +1,5 @@
outs:
- md5: 709400309f890ea3e16cf2c816dead42
- md5: 02784ea9d25a9a9c94c20acca001456c
size: 215198
hash: md5
path: vam.lib
@@ -1,5 +1,5 @@
outs:
- md5: 6af43c37bb2018208ba884ca155a3cf6
- md5: 6a1ac31db298434da1573cda69d9e4d3
size: 1356642
hash: md5
path: zstd.lib