Commit f2082fd

GPA 3.14 updates
1 parent 376cec1 commit f2082fd

364 files changed

+59298
-2369932
lines changed

BUILD.md

Lines changed: 56 additions & 47 deletions
Original file line numberDiff line numberDiff line change
@@ -1,47 +1,49 @@
11
# GPUPerfAPI Build Instructions
22
---
33
## Table of Contents
4-
* [Cloning/Updating Dependent Repositories](#cloningupdating-dependent-repositories)
4+
* [Cloning & Updating Dependent Repositories](#cloningupdating-dependent-repositories)
55
* [Windows Build Information](#windows-build-information)
66
* [Linux Build Information](#linux-build-information)
77
* [PublicCounterCompiler Tool](#publiccountercompiler-tool)
88

9-
## Cloning/Updating Dependent Repositories
9+
## Cloning & Updating Dependent Repositories
1010
GPUPerfAPI no longer uses git submodules to reference dependent repositories. Instead, you need to follow these instructions in
11-
order to clone/update any dependent repositories.
11+
order to clone and update any dependent repositories.
1212

1313
#### Prerequisites
1414
* Python 3.x, which can be installed from https://www.python.org/.
15-
* CMake 3.19 or newer
16-
* For Windows, this can be downloaded from https://cmake.org/download/
17-
* For Linux, this can be installed using: sudo apt-get install cmake
15+
* CMake 3.19 and newer.
16+
* For Windows, this can be downloaded from https://cmake.org/download/.
17+
* For Linux, this can be installed using:
18+
* `sudo apt-get install cmake`
1819
* To build the documentation:
1920
* Install Sphinx:
20-
* `pip install Sphinx`
21+
* `pip install sphinx`
2122
* Install Sphinx Read The Docs theme:
2223
* `pip install sphinx-rtd-theme` (read the docs theme is not installed by default)
2324
* Install spelling checker (optional)
2425
* `pip install pyenchant`
2526
* `pip install sphinxcontrib-spelling`
27+
* **Note:** Be sure to add Python scripts to PATH in order to build documentation from prebuild scripts.
2628

2729
#### Instructions
28-
* Simply execute the [pre_build.py](build/pre_build.py) python script located in the GPA directory:
30+
* Simply execute the [pre_build.py](build/pre_build.py) Python script located in the GPA directory:
2931
* `python build/pre_build.py`
3032
* This script will clone any dependent repositories that are not present on the system. If any of the dependent repositories are already
3133
present on the system, this script will instead do a "git pull" on those repositories to ensure that they are up to date. Please re-run
3234
this script everytime you pull new changes from GPA repository.
33-
* NOTE: For GPA 3.3 or newer, if you are updating an existing clone of the GPA repo from a GPA release prior than 3.3, you will first need to delete the Common/Lib/Ext/GoogleTest directory. Starting with GPA 3.3, GPA is now using a fork of the official GoogleTest repo. Failure to remove this directory will lead to git errors when running [pre_build.py](build/pre_build.py) or [fetch_dependencies.py](scripts/fetch_dependencies.py).
34-
* NOTE: For GPA 3.11 and newer, the Common/ directory has been removed, and the external/Lib/Ext/GoogleTest will first need to be deleted instead.
35-
* This script will also download and execute the Vulkan SDK installer.
36-
* On Windows, running the installer may require elevation. If you've previously installed the required Vulkan version, fetch_dependencies.py will simply copy the files from the default installation location into the correct place into the GPUPerfAPI directory tree.
35+
* **Note:** For GPA 3.3 and newer, if you are updating an existing clone of the GPA repo from a GPA release prior to 3.3, you will first need to delete the Common/Lib/Ext/GoogleTest directory. Starting with GPA 3.3, GPA is now using a fork of the official GoogleTest repo. Failure to remove this directory will lead to Git errors when running [pre_build.py](build/pre_build.py) or [fetch_dependencies.py](scripts/fetch_dependencies.py).
36+
* **Note:** For GPA 3.11 and newer, the Common/ directory has been removed, and the external/Lib/Ext/GoogleTest will first need to be deleted instead.
37+
* This script will also download and execute the Vulkan SDK installer.
38+
* On Windows, running the installer may require elevation. If you've previously installed the required Vulkan version, fetch_dependencies.py will simply copy the files from the default installation location into the correct place into the GPUPerfAPI directory tree.
3739
* fetch_dependencies.py is set up to install the version of the Vulkan SDK which was used during development. If you want to use a newer version of the SDK, the following file will need to be updated:
3840
* [fetch_dependencies.py](scripts/fetch_dependencies.py)
3941
* By default the build will expect the Vulkan SDK to be found in a directory pointed to by the `VULKAN_SDK` environment variable. This environment variable is automatically set by the Windows SDK installer, but you may need to set it manually after running the Linux SDK installer. The Linux SDK includes a script called `setup-env.sh` to aid in setting this environment variable:
40-
* `source ~/VulkanSDK/1.0.68.0/setup-env.sh` (adjust path as necessary)
41-
* This script also executes cmake to generate all required files to build GPA.
42-
* If you want to generate all cmake build files without trying to clone/pull dependent repos, you can add "--nofetch" to the [pre_build.py](build/pre_build.py) command line.
42+
* `source ~/VulkanSDK/1.0.68.0/setup-env.sh` (adjust path as necessary)
43+
* This script also executes CMake to generate all required files to build GPA.
44+
* If you want to generate all CMake build files without trying to clone/pull dependent repos, you can add "--nofetch" to the [pre_build.py](build/pre_build.py) command line.
4345
* Additional switches that can be used with the [pre_build.py](build/pre_build.py) script:
44-
* `--vs=[2015,2017,2019,2022]`: Specify the Visual Studio version for which to generate projects. Default is 2019.
46+
* `--vs=[2015,2017,2019,2022]`: Specify the Visual Studio version for which to generate projects. Default is 2022.
4547
* `--config=[debug,release]`: Specify the config for which to generate makefiles. Default is both. A specific config can only be specified on Linux. On Windows, both configs are always supported by the generated VS solution and project files.
4648
* `--platform=[x86,x64]`: Specify the platform for which to generate build files. Default is both.
4749
* `--clean`: Deletes CMakeBuild directory and regenerates all build files from scratch
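The switches listed in this hunk can be combined on one command line. The following is a minimal sketch of assembling such a command line, assuming it is run from the GPA root directory. The helper `build_pre_build_command` is hypothetical and is not part of GPA; it only uses the switches and values shown above, and pre_build.py may accept others that are not visible in this hunk.

```python
import sys

# Allowed values copied from the switch list above.
VS_VERSIONS = {"2015", "2017", "2019", "2022"}
CONFIGS = {"debug", "release"}
PLATFORMS = {"x86", "x64"}


def build_pre_build_command(vs=None, config=None, platform=None,
                            clean=False, nofetch=False):
    """Return an argv list for build/pre_build.py (hypothetical helper)."""
    cmd = [sys.executable, "build/pre_build.py"]
    if vs is not None:
        if str(vs) not in VS_VERSIONS:
            raise ValueError(f"unsupported Visual Studio version: {vs}")
        cmd.append(f"--vs={vs}")
    if config is not None:
        # Per the text above, a specific config can only be chosen on Linux.
        if config not in CONFIGS:
            raise ValueError(f"unsupported config: {config}")
        cmd.append(f"--config={config}")
    if platform is not None:
        if platform not in PLATFORMS:
            raise ValueError(f"unsupported platform: {platform}")
        cmd.append(f"--platform={platform}")
    if clean:
        cmd.append("--clean")
    if nofetch:
        cmd.append("--nofetch")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_pre_build_command(vs=2022, platform="x64", nofetch=True)))
```

To actually run the script, the returned list could be passed to `subprocess.run(cmd, check=True)`.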
@@ -57,60 +59,67 @@ this script everytime you pull new changes from GPA repository.
5759
## Windows Build Information
5860

5961
##### Prerequisites
60-
* Microsoft Visual Studio 2019
61-
* Windows 10 SDK Version 10.0.10586.0 from https://developer.microsoft.com/en-US/windows/downloads/windows-10-sdk
62-
* You can override the version of the Windows 10 SDK used by modifying external/Lib/Ext/Windows-Kits/Global-WindowsSDK.cmake
63-
* Microsoft .NET 4.6.2 SDK from https://dotnet.microsoft.com/en-us/download/dotnet-framework/thank-you/net462-developer-pack-offline-installer
62+
* Microsoft Visual Studio 2022
63+
* Within the Visual Studio Installer, the following workloads:
64+
* Desktop development with C++
65+
* Within the Visual Studio Installer, the following individual components:
66+
* Windows 11 SDK (10.0.22621.0)
67+
* MSVC v143 - VS 2022 C++ x64/x86 build tools (latest)
68+
* C++ CMake tools for Windows
69+
* C++ Clang Compiler for Windows (15.0.1)
70+
* .NET Framework 4.6.2-4.8.1 SDKs and targeting packs
71+
* C# and Visual Basic
72+
* C# and Visual Basic Roslyn compilers
6473

6574
##### Build Instructions
66-
* Load cmake_bld\x64\GPUPerfAPI.sln into Visual Studio to build the 64-bit version of GPA
67-
* Load cmake_bld\x86\GPUPerfAPI.sln into Visual Studio to build the 32-bit version of GPA
68-
* After a successful build, the GPUPerfAPI binaries can be found in `gpu_performance_api\output\$(Configuration)` (for example gpu_performance_api\output\release)
75+
* Load cmake_bld\x64\GPUPerfAPI.sln into Visual Studio to build the 64-bit version of GPA.
76+
* Load cmake_bld\x86\GPUPerfAPI.sln into Visual Studio to build the 32-bit version of GPA.
77+
* After a successful build, the GPUPerfAPI binaries can be found in `gpu_performance_api\output\$(Configuration)` (e.g. gpu_performance_api\output\release).
6978

7079
#### Additional Information
71-
* The Windows projects each include a .rc file that embeds the VERSIONINFO resource into the final binary. Internally within AMD, a Jenkins build system will dynamically update
80+
* The Windows projects each include a .rc file that embeds the VERSIONINFO resource into the final binary. Internally within AMD, a Jenkins build system will dynamically update
7281
the build number. The version and build numbers can be manually updated by modifying the [gpa_version.h](source/gpu_perf_api_common/gpa_version.h) file.
7382

7483
## Linux Build Information
7584

7685
##### Prerequisites
77-
* Install the Mesa common development package: sudo apt-get install mesa-common-dev
86+
* Install the Mesa common development package:
87+
* `sudo apt-get install mesa-common-dev`
7888

7989
##### Build Instructions
80-
* Execute "make" in the cmake_bld/x64/debug to build the 64-bit debug version of GPA
81-
* Execute "make" in the cmake_bld/x64/release to build the 64-bit release version of GPA
82-
* After a successful build, the GPUPerfAPI binaries can be found in `gpu_performance_api/output/$(Configuration)` (for example gpu_performance_api/output/release)
90+
* Execute "make" in the cmake_bld/x64/debug to build the 64-bit debug version of GPA.
91+
* Execute "make" in the cmake_bld/x64/release to build the 64-bit release version of GPA.
92+
* After a successful build, the GPUPerfAPI binaries can be found in `gpu_performance_api/output/$(Configuration)` (e.g. gpu_performance_api/output/release).
8393

8494
## PublicCounterCompiler Tool
85-
86-
The PublicCounterCompiler Tool is a utility, written in C#, that will generate C++ code to define the public (or derived) counters.
95+
The PublicCounterCompiler Tool is a C# utility that will generate C++ code to define the public (or derived) counters.
8796
It takes as input text files contained in the [public_counter_compiler_input_files](source/public_counter_compiler_input_files) directory and
8897
outputs files in the [gpu_perf_api_counter_generator](source/auto_generated/gpu_perf_api_counter_generator), [gpu_perf_api_unit_tests](source/auto_generated/gpu_perf_api_unit_tests)
8998
and [docs](docs) directories.
9099

91100
There are three ways to execute the tool:
92-
* With no parameters - it opens the user interface with no fields prepopulated
93-
* With two parameters - it opens the user interface with the two main fields prepopulated. When you press the "Compile Public Counters" button it will load the correct input files and generate the output files in the correct location.
94-
* Param 1: API -- the API to compile counters for (ex: GL, CL, DX11, DX12, VK, etc).
95-
* Param 2: HW generation: the generation to compile counters for (ex: Gfx6, Gfx7, Gfx8, etc.)
96-
* With six or seven parameters - the user interface does not open. It simply generates the c++ files using the specified input and output file locations
97-
* Param 1: Counter names file - text file containing hardware counter names and type (CounterNames[API][GEN].txt)
98-
* Param 2: Public counter definition file - text file defining how the public counters are calculated (PublicCounterDefinitions\*.txt)
99-
* Param 3: Output Dir - the directory to generate the output in (Ex: the path to the GPUPerfAPICounterGenerator directory)
100-
* Param 4: Test output Dir - the directory to generate the test output in (Ex: the path to the GPUPerfAPIUnitTests/counters directory)
101-
* Param 5: API - the API to take the counter names from (ex: DX12)
102-
* Param 6: GPU - the GPU to take the counter names from (ex: Gfx9)
103-
* Param 7: GPU ASIC - (optional) the subversion of GPU to take the counter names from
101+
* With no parameters - the user interface opens with no fields prepopulated.
102+
* With two parameters - the user interface opens with the two main fields prepopulated. Note that pressing the "Compile Public Counters" button will load the correct input files and generate the output files in the correct location.
103+
* Param 1: **API** - the API to compile counters for (e.g. GL, CL, DX11, DX12, VK, etc.)
104+
* Param 2: **HW generation** - the generation to compile counters for (e.g. Gfx9, Gfx10, Gfx11, etc.)
105+
* With six or seven parameters - the user interface does not open; the tool generates the C++ files using the specified input and output file locations.
106+
* Param 1: **Counter names file** - text file containing hardware counter names and type (CounterNames[API][GEN].txt)
107+
* Param 2: **Public counter definition file** - text file defining how the public counters are calculated (PublicCounterDefinitions\*.txt)
108+
* Param 3: **Output Dir** - the directory to generate the output in (e.g. the path to the GPUPerfAPICounterGenerator directory)
109+
* Param 4: **Test output Dir** - the directory to generate the test output in (e.g. the path to the GPUPerfAPIUnitTests/counters directory)
110+
* Param 5: **API** - the API to take the counter names from (e.g. DX12)
111+
* Param 6: **GPU** - the GPU to take the counter names from (e.g. Gfx11)
112+
* Param 7: **GPU ASIC** - the subversion of GPU to take the counter names from (optional)
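The three invocation modes described above can be summarised by parameter count. The sketch below is written in Python for illustration only (the tool itself is written in C#), and the mode labels, dictionary keys, and function name are hypothetical. How the real tool reacts to any other parameter count is not stated here, so raising an error is an assumption.

```python
# Hypothetical key names for the six or seven headless parameters, in order.
HEADLESS_PARAM_NAMES = [
    "counter_names_file",
    "public_counter_definition_file",
    "output_dir",
    "test_output_dir",
    "api",
    "gpu",
    "gpu_asic",  # optional seventh parameter
]


def classify_invocation(args):
    """Map a parameter list to (mode, named parameters) per the rules above."""
    count = len(args)
    if count == 0:
        return "ui", {}
    if count == 2:
        return "ui_prepopulated", {"api": args[0], "hw_generation": args[1]}
    if count in (6, 7):
        # zip stops at the shorter list, so gpu_asic is absent with six args.
        return "headless", dict(zip(HEADLESS_PARAM_NAMES, args))
    # Assumption: any other count is treated as an error in this sketch.
    raise ValueError(f"expected 0, 2, 6, or 7 parameters, got {count}")
```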
104113

105114
See the various `public_counter_definitions_*.txt` files in the [public_counter_compiler_input_files](source/public_counter_compiler_input_files) directory. These contain all the counter definitions.
106115
Each counter is given a name, a description, a type, an optional usage type, a list of hardware counters required and a formula applied to the values of the hardware counters to calculate the value of the counter.
107116

108117
Counter formulas are expressed in a Reverse Polish Notation and are made up of the following elements:
109-
* numbers: these are zero-based counter indexes referring to individual counters within the list of hardware counters
110-
* hardware counters may also be referred to by name (e.g.: GPUTime_Bottom_To_Bottom) or templated name (e.g.: SPI*_SPI_PERF_CSG_BUSY) which will automatically refer to the correct number of instances
111-
* math operators: The supported operators are +, -, /, *
118+
* numbers: Zero-based counter indexes referring to individual counters within the list of hardware counters
119+
* math operators: Supported operators are +, -, /, *
112120
* numeric literals: Numbers contained within parentheses are numeric literals (as opposed to counter indexes)
113-
* functions: The supported functions are: min, max, sum, ifnotzero, and vcomparemax4. "max and "sum" have variants that work on multiple items at once (i.e. sum16, sum64, etc.)
114-
* hardware params: The supported hardware params are "num_shader_engines". "num_simds", "su_clock_prim", "num_prim_pipes", and "TS_FREQ"
121+
* functions: Supported functions are: min, max, sum, ifnotzero, and vcomparemax4. "max" and "sum" have variants that work on multiple items at once (e.g. sum16, sum64, etc.)
122+
* hardware params: Supported hardware params are "num_shader_engines", "num_simds", "su_clock_prim", "num_prim_pipes", and "TS_FREQ"
123+
* **Note:** Hardware counters may also be referred to by name (e.g.: GPUTime_Bottom_To_Bottom) or templated name (e.g.: SPI*_SPI_PERF_CSG_BUSY) which will automatically refer to the correct number of instances.
115124

116125
For more details, see the "EvaluateExpression" function in the [gpa_derived_counter.cc](source/gpu_perf_api_counter_generator/gpa_derived_counter.cc) file.
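As an illustration of the formula elements described above, here is a minimal Reverse Polish Notation evaluator in Python. It is a sketch under stated assumptions and not the `EvaluateExpression` implementation: the comma token separator, the operand order, and the function name `evaluate_rpn` are assumptions, and only counter indexes, parenthesised numeric literals, the four math operators, and two-argument `min`/`max` are handled. Hardware params, named counters, and the remaining functions are left out.

```python
import operator

# The four math operators named in the element list above.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}
# Only the two-argument forms of min and max are sketched here.
FUNCTIONS = {"min": min, "max": max}


def evaluate_rpn(formula, hw_counters):
    """Evaluate a comma-separated RPN formula (separator is an assumption)."""
    stack = []
    for raw in formula.split(","):
        token = raw.strip()
        if token in OPERATORS or token in FUNCTIONS:
            # Assumption: the operand pushed first is the left-hand operand.
            right = stack.pop()
            left = stack.pop()
            func = OPERATORS.get(token) or FUNCTIONS[token]
            stack.append(func(left, right))
        elif token.startswith("(") and token.endswith(")"):
            # Numbers in parentheses are numeric literals.
            stack.append(float(token[1:-1]))
        else:
            # Bare numbers are zero-based indexes into the hardware counters.
            stack.append(float(hw_counters[int(token)]))
    if len(stack) != 1:
        raise ValueError("malformed formula")
    return stack[0]


if __name__ == "__main__":
    # counter[0] / counter[1] * 100 with counters 50 and 200
    print(evaluate_rpn("0,1,/,(100),*", [50, 200]))  # prints 25.0
```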

NOTICES.txt

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
DO NOT TRANSLATE OR LOCALIZE
22

33

4-
Third-Party Notices Report for GPUPerfAPI v3.13.1
4+
Third-Party Notices Report for GPUPerfAPI v3.14
55

66

77

README.md

Lines changed: 15 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -31,12 +31,21 @@ Prebuilt binaries can be downloaded from the Releases page: https://github.com/G
3131
* Provides access to some raw hardware counters. See [Raw Hardware Counters](#raw-hardware-counters) for more information.
3232

3333
## What's New
34-
### Version 3.13.1 (06/22/2023)
35-
* Add support for additional AMD Radeon RX 7000 Series hardware.
36-
* Add support for AMD Radeon 700M Series APUs.
37-
* Vulkan and OpenGL are supported on existing drivers; DX12, DX11, and OpenCL will be enabled by an upcoming driver.
38-
* Bug Fixes:
39-
* Fixed performance regression in GPUPerfAPIDX12[-x64].dll
34+
### Version 3.14 (09/21/2023)
35+
* Added support for AMD Radeon RX 7700 XT and AMD Radeon RX 7800 XT graphics cards.
36+
* Added support for additional AMD Radeon 700M Series devices.
37+
* Added counters back to Gfx9, Gfx10, Gfx103, and Gfx11 hardware generations. These restored counters are listed below by group:
38+
* Timing: TessellatorBusy, TessellatorBusyCycles, VsGsBusy, VsGsBusyCycles, VsGsTime, PreTessellationBusy, PreTessellationBusyCycles, PreTessellationTime, PostTessellationBusy, PostTessellationBusyCycles, PostTessellationTime
39+
* VertexGeometry: VsGsVerticesIn, VsGsPrimsIn, GSVerticesOut
40+
* PreTessellation: PreTessVerticesIn
41+
* PostTessellation: PostTessPrimsOut
42+
* PrimitiveAssembly: PrimitivesIn
43+
* TextureUnit: TexTriFilteringPct, TexTriFilteringCount, NoTexTriFilteringCount, TexVolFilteringPct, TexVolFilteringCount, NoTexVolFilteringCount
44+
* New counters added:
45+
* MemoryCache: L0TagConflictReadStalledCycles, L0TagConflictWriteStalledCycles, L0TagConflictAtomicStalledCycles
46+
* Changed to Visual Studio 2022 as the default build environment on Windows (previously Visual Studio 2019).
47+
* Added improved support for multi-GPU systems.
48+
* Removed code related to software counters on non-AMD hardware.
4049

4150
## System Requirements
4251
* An AMD Radeon GPU or APU based on Graphics IP version 8 and newer.

ReleaseNotes.md

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,21 @@
11
# GPU Performance API Release Notes
22
---
3+
## Version 3.14 (09/21/2023)
4+
* Added support for AMD Radeon RX 7700 XT and AMD Radeon RX 7800 XT graphics cards.
5+
* Added support for additional AMD Radeon 700M Series devices.
6+
* Added counters back to Gfx9, Gfx10, Gfx103, and Gfx11 hardware generations. These restored counters are listed below by group:
7+
* Timing: TessellatorBusy, TessellatorBusyCycles, VsGsBusy, VsGsBusyCycles, VsGsTime, PreTessellationBusy, PreTessellationBusyCycles, PreTessellationTime, PostTessellationBusy, PostTessellationBusyCycles, PostTessellationTime
8+
* VertexGeometry: VsGsVerticesIn, VsGsPrimsIn, GSVerticesOut
9+
* PreTessellation: PreTessVerticesIn
10+
* PostTessellation: PostTessPrimsOut
11+
* PrimitiveAssembly: PrimitivesIn
12+
* TextureUnit: TexTriFilteringPct, TexTriFilteringCount, NoTexTriFilteringCount, TexVolFilteringPct, TexVolFilteringCount, NoTexVolFilteringCount
13+
* New counters added:
14+
* MemoryCache: L0TagConflictReadStalledCycles, L0TagConflictWriteStalledCycles, L0TagConflictAtomicStalledCycles
15+
* Changed to Visual Studio 2022 as the default build environment on Windows (previously Visual Studio 2019).
16+
* Added improved support for multi-GPU systems.
17+
* Removed code related to software counters on non-AMD hardware.
18+
319
## Version 3.13.1 (06/22/2023)
420
* Add support for additional AMD Radeon RX 7000 Series hardware.
521
* Add support for AMD Radeon 700M Series APUs.

build/cmake_modules/build_flags.cmake

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
## Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All rights reserved.
1+
## Copyright (c) 2018-2023 Advanced Micro Devices, Inc. All rights reserved.
22
cmake_minimum_required(VERSION 3.5.1)
33

44
## GPA has only Debug and Release

build/cmake_modules/common.cmake

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
## Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All rights reserved.
1+
## Copyright (c) 2018-2023 Advanced Micro Devices, Inc. All rights reserved.
22
cmake_minimum_required(VERSION 3.5.1)
33

44
include (${GPA_CMAKE_MODULES_DIR}/utils.cmake)

build/cmake_modules/defs.cmake

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,10 @@
1-
## Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All rights reserved.
1+
## Copyright (c) 2018-2023 Advanced Micro Devices, Inc. All rights reserved.
22
cmake_minimum_required(VERSION 3.19)
33

44
## Define the GPA version
55
set(GPA_MAJOR_VERSION 3)
6-
set(GPA_MINOR_VERSION 13)
7-
set(GPA_UPDATE_VERSION 1)
6+
set(GPA_MINOR_VERSION 14)
7+
set(GPA_UPDATE_VERSION 0)
88

99
if(NOT DEFINED GPA_BUILD_NUMBER)
1010
set(GPA_BUILD_NUMBER 0)

docs/doxygen/DoxyfilePublic

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -31,7 +31,7 @@ PROJECT_NAME = "GPU Perf API"
3131
# This could be handy for archiving the generated documentation or
3232
# if some version control system is used.
3333

34-
PROJECT_NUMBER = 3.13
34+
PROJECT_NUMBER = 3.14
3535

3636
# The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute)
3737
# base path where the generated documentation will be put.

docs/sphinx/source/compute_counter_tables_gfx10.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
.. Copyright(c) 2018-2023 Advanced Micro Devices, Inc.All rights reserved.
1+
.. Copyright(c) 2018-2023 Advanced Micro Devices, Inc. All rights reserved.
22
.. Compute Performance Counters for RDNA
33
44
.. *** Note, this is an auto-generated file. Do not edit. Execute PublicCounterCompiler to rebuild.
