Commit 018ffc4

GPA 3.8 updates
1 parent 02c8257 commit 018ffc4

664 files changed

+697873
-698507
lines changed

BUILD.md

Lines changed: 5 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -11,7 +11,7 @@ GPUPerfAPI no longer uses git submodules to reference dependent repositories. In
1111
order to clone/update any dependent repositories.
1212

1313
#### Prerequisites
14-
* Python, which can be installed from https://www.python.org/. Either Python 2.7 or 3.x should work.
14+
* Python 3.x, which can be installed from https://www.python.org/.
1515
* CMake 3.7.2 or newer
1616
* For Windows, this can be downloaded from https://cmake.org/download/
1717
* For Linux, this can be installed using: sudo apt-get install cmake
@@ -32,8 +32,8 @@ present on the system, this script will instead do a "git pull" on those reposit
3232
this script every time you pull new changes from the GPA repository.
3333
* NOTE: For GPA 3.3 or newer, if you are updating an existing clone of the GPA repo from a GPA release prior to 3.3, you will first need to delete the Common/Lib/Ext/GoogleTest directory. Starting with GPA 3.3, GPA uses a fork of the official GoogleTest repo. Failure to remove this directory will lead to git errors when running [pre_build.py](build/pre_build.py) or [fetch_dependencies.py](scripts/fetch_dependencies.py).
3434
* This script will also download and execute the Vulkan® SDK installer.
35-
* On Windows, running the installer may require elevation. If you've previously installed the required Vulkan version, UpdateCommon will simply copy the files form the default installation location into the correct place into the GPUPerfAPI directory tree.
36-
* UpdateCommon is set up to install the version of the Vulkan SDK which was used during development. If you want to use a newer version of the SDK, the following file will need to be updated:
35+
* On Windows, running the installer may require elevation. If you've previously installed the required Vulkan version, fetch_dependencies.py will simply copy the files from the default installation location into the correct place in the GPUPerfAPI directory tree.
36+
* fetch_dependencies.py is set up to install the version of the Vulkan SDK which was used during development. If you want to use a newer version of the SDK, the following file will need to be updated:
3737
* [fetch_dependencies.py](scripts/fetch_dependencies.py)
3838
* By default the build will expect the Vulkan SDK to be found in a directory pointed to by the `VULKAN_SDK` environment variable. This environment variable is automatically set by the Windows SDK installer, but you may need to set it manually after running the Linux SDK installer. The Linux SDK includes a script called `setup-env.sh` to aid in setting this environment variable:
3939
* `source ~/VulkanSDK/1.0.68.0/setup-env.sh` (adjust path as necessary)
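The `VULKAN_SDK` lookup described above can be checked before generating build files. The following is a minimal sketch, not part of the GPA build scripts: the helper name `vulkan_sdk_dir` is hypothetical, and it only reads the environment variable that BUILD.md names.

```python
import os


def vulkan_sdk_dir(env=None):
    """Return the directory named by VULKAN_SDK, or None if it is unset or empty.

    Hypothetical helper for illustration; `env` defaults to the process environment.
    """
    env = os.environ if env is None else env
    path = env.get("VULKAN_SDK", "")
    return path or None


if __name__ == "__main__":
    sdk = vulkan_sdk_dir()
    if sdk is None:
        print("VULKAN_SDK is not set; on Linux, source the SDK's setup-env.sh first.")
    else:
        print("Vulkan SDK expected at:", sdk)
```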
@@ -44,7 +44,6 @@ this script everytime you pull new changes from GPA repository.
4444
* `--config=[debug,release]`: Specify the config for which to generate makefiles. Default is both. A specific config can only be specified on Linux. On Windows, both configs are always supported by the generated VS solution and project files.
4545
* `--platform=[x86,x64]`: Specify the platform for which to generate build files. Default is both.
4646
* `--clean`: Deletes CMakeBuild directory and regenerates all build files from scratch
47-
* `--internal`: Generates build files to build the internal version of GPA
4847
* `--skipdx11`: Does not generate build files for DX11 version of GPA (Windows only)
4948
* `--skipdx12`: Does not generate build files for DX12 version of GPA (Windows only)
5049
* `--skipvulkan`: Does not generate build files for Vulkan version of GPA
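The options shown in this hunk can be combined on one command line. The sketch below assembles such a command under stated assumptions: the helper `pre_build_command` is hypothetical, the script is assumed to be invoked as `build/pre_build.py` from the repository root, and only flags listed above are used.

```python
import sys

# Flag spellings are copied from the BUILD.md option list; the helper itself is illustrative.
SKIP_FLAGS = {"dx11": "--skipdx11", "dx12": "--skipdx12", "vulkan": "--skipvulkan"}


def pre_build_command(config=None, platform=None, clean=False, skip=()):
    """Build an argv list for pre_build.py from the documented options."""
    argv = [sys.executable, "build/pre_build.py"]  # assumed path relative to the repo root
    if config is not None:
        if config not in ("debug", "release"):
            raise ValueError("config must be 'debug' or 'release'")
        argv.append(f"--config={config}")  # per BUILD.md, a specific config applies on Linux only
    if platform is not None:
        if platform not in ("x86", "x64"):
            raise ValueError("platform must be 'x86' or 'x64'")
        argv.append(f"--platform={platform}")
    if clean:
        argv.append("--clean")
    for api in skip:
        argv.append(SKIP_FLAGS[api])  # KeyError for anything not listed above
    return argv
```

The resulting list can be passed to `subprocess.run(...)`; omitting `config` and `platform` keeps the documented default of generating both.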
@@ -65,12 +64,11 @@ this script everytime you pull new changes from GPA repository.
6564
##### Build Instructions
6665
* Load cmake_bld\x64\GPUPerfAPI.sln into Visual Studio to build the 64-bit version of GPA
6766
* Load cmake_bld\x86\GPUPerfAPI.sln into Visual Studio to build the 32-bit version of GPA
68-
* After a successful build, the GPUPerfAPI binaries can be found in `GPA\Output\$(Configuration)` (for example GPA\Output\Release)
67+
* After a successful build, the GPUPerfAPI binaries can be found in `gpu_performance_api\output\$(Configuration)` (for example gpu_performance_api\output\release)
6968

7069
#### Additional Information
7170
* The Windows projects each include a .rc file that embeds the VERSIONINFO resource into the final binary. Internally within AMD, a Jenkins build system will dynamically update
7271
the build number. The version and build numbers can be manually updated by modifying the [gpa_version.h](source/gpu_perf_api_common/gpa_version.h) file.
73-
* When building the internal version (using the --internal switch when calling [pre_build.py](build/pre_build.py), each binary filename will have a "-Internal" suffix (for example GPUPerfAPIDX11-x64-Internal.dll)
7472

7573
## Linux Build Information
7674

@@ -83,7 +81,7 @@ this script everytime you pull new changes from GPA repository.
8381
* Execute "make" in the cmake_bld/x64/release to build the 64-bit release version of GPA
8482
* Execute "make" in the cmake_bld/x86/debug to build the 32-bit debug version of GPA
8583
* Execute "make" in the cmake_bld/x86/release to build the 32-bit release version of GPA
86-
* After a successful build, the GPUPerfAPI binaries can be found in `GPA/Output/$(Configuration)` (for example GPA/Output/release)
84+
* After a successful build, the GPUPerfAPI binaries can be found in `gpu_performance_api/output/$(Configuration)` (for example gpu_performance_api/output/release)
8785
* When building the internal version, each binary filename will also have a "-Internal" suffix (for example libGPUPerfAPIGL-Internal.so)
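The per-directory `make` steps above follow one pattern, `cmake_bld/<platform>/<config>`. A minimal sketch that enumerates them is shown here; the helper names are hypothetical, and it assumes the build files have already been generated by pre_build.py.

```python
import subprocess
from itertools import product


def make_invocations(platforms=("x64", "x86"), configs=("debug", "release")):
    """Return (directory, argv) pairs matching the documented build directories."""
    return [(f"cmake_bld/{p}/{c}", ["make"]) for p, c in product(platforms, configs)]


def run_builds(invocations):
    """Run each make command in its directory, stopping at the first failure."""
    for directory, argv in invocations:
        subprocess.run(argv, cwd=directory, check=True)
```

Calling `run_builds(make_invocations(platforms=("x64",)))` would build only the 64-bit debug and release configurations.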
8886

8987
## PublicCounterCompiler Tool

LICENSE

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1,4 +1,4 @@
1-
Copyright (c) 2016-2020 Advanced Micro Devices, Inc. All rights reserved.
1+
Copyright (c) 2016-2021 Advanced Micro Devices, Inc. All rights reserved.
22

33
Permission is hereby granted, free of charge, to any person obtaining a copy
44
of this software and associated documentation files (the "Software"), to deal

README.md

Lines changed: 49 additions & 46 deletions
Original file line number | Diff line number | Diff line change
@@ -1,16 +1,11 @@
11
# GPU Performance API
22
---
3-
4-
## Overview
5-
The GPU Performance API (GPUPerfAPI, or GPA) is a powerful library, providing access to GPU Performance Counters.
3+
The GPU Performance API (GPUPerfAPI, or GPA) is a powerful library which provides access to GPU Performance Counters.
64
It can help analyze the performance and execution characteristics of applications using a Radeon™ GPU. This library
7-
is used by [Radeon Compute Profiler](https://github.com/GPUOpen-Tools/RCP) and [CodeXL](https://github.com/GPUOpen-Tools/CodeXL)
8-
as well as several third-party tools.
9-
10-
## Downloads
11-
Prebuilt binaries can be downloaded from the Releases page: https://github.com/GPUOpen-Tools/GPA/releases
5+
is used by [Radeon GPU Profiler](https://github.com/GPUOpen-Tools/radeon_gpu_profiler) as well as several third-party tools.
126

137
## Table of Contents
8+
* [Downloads](#downloads)
149
* [Major Features](#major-features)
1510
* [What's New](#whats-new)
1611
* [System Requirements](#system-requirements)
@@ -24,6 +19,9 @@ Prebuilt binaries can be downloaded from the Releases page: https://github.com/G
2419
* [Historical Release Notes](ReleaseNotes.md)
2520
* [Style and Format Change](#Style-and-Format-Change)
2621

22+
## Downloads
23+
Prebuilt binaries can be downloaded from the Releases page: https://github.com/GPUOpen-Tools/gpu_performance_api/releases.
24+
2725
## Major Features
2826
* Provides a standard API for accessing GPU Performance counters for both graphics and compute workloads across multiple GPU APIs.
2927
* Supports Vulkan™, DirectX™ 12, DirectX 11, OpenGL, and OpenCL™.
@@ -33,24 +31,21 @@ Prebuilt binaries can be downloaded from the Releases page: https://github.com/G
3331
* Provides access to some raw hardware counters. See [Raw Hardware Counters](#raw-hardware-counters) for more information.
3432

3533
## What's New
36-
* Version 3.7 (11/24/20)
37-
* Add support for additional GPUs and APUs, including AMD RDNA™ 2 Radeon™ RX 6000 series GPUs.
38-
* New RT counters for DXR workloads on AMD RDNA™ 2 Radeon™ RX 6000 series GPUs.
39-
* RayTriTests, and RayBoxTests: These counters collect the number of ray intersections for triangles and boxes, respectively.
40-
* TotalRayTests: This counter collects the aggregated number of ray-box and ray-triangle intersection tests.
41-
* RayTestsPerWave: This counter collects ray intersection test count at a more granular level – per wave.
42-
* New Scalar and Instruction cache counters on AMD RDNA™ Radeon™ RX 5000 series GPUs.
43-
* Scalar cache: ScalarCacheHit, ScalarCacheRequestCount, ScalarCacheHitCount, ScalarCacheMissCount
44-
* Instruction cache: InstCacheHit, InstCacheRequestCount, InstCacheHitCount, InstCacheMissCount
45-
* Update the Vulkan® sample to remove the static link and use the system-specific Vulkan® loader.
46-
* Remove OpenCL™ support from Linux.
47-
* Remove downloading the Vulkan® SDK by the build script.
34+
* Version 3.8 (04/01/21)
35+
* Add support for additional GPUs and APUs, including AMD Radeon™ RX 6700 series GPUs.
36+
* Code has been updated to adhere to the Google C++ Style Guide.
37+
* New public headers have been added.
38+
* Old headers are deprecated and will emit a compile-time message.
39+
* Projects loading GPA will need to be recompiled, but no code changes are required unless moving to the new headers.
40+
* Improvements made to sample applications.
41+
* Updated documentation for the new code style (and https://github.com/GPUOpen-Tools/gpu_performance_api/issues/56)
42+
* Support for the --internal flag has been removed from the build script.
4843

4944
## System Requirements
5045
* An AMD Radeon GPU or APU based on Graphics IP version 8 and newer.
5146
* Windows: Radeon Software Adrenalin 2020 Edition 20.11.2 or later (Driver Packaging Version 20.45 or later).
5247
* Linux: Radeon Software for Linux Revision 20.45 or later.
53-
* Radeon GPUs or APUs based on Graphics IP version 6 and 7 are no longer supported by GPUPerfAPI. Please use an older version ([3.3](https://github.com/GPUOpen-Tools/GPA/releases/tag/v3.3)) with older hardware.
48+
* Radeon GPUs or APUs based on Graphics IP version 6 and 7 are no longer supported by GPUPerfAPI. Please use an older version ([3.3](https://github.com/GPUOpen-Tools/gpu_performance_api/releases/tag/v3.3)) with older hardware.
5449
* Windows 7, 8.1, and 10.
5550
* Ubuntu (16.04 and later) and CentOS/RHEL (7 and later) distributions.
5651

@@ -59,12 +54,12 @@ To clone the GPA repository, execute the following git command
5954
* git clone https://github.com/GPUOpen-Tools/gpu_performance_api
6055

6156
After cloning the repository, please run the following python script to retrieve the required dependencies and generate the build files (see [BUILD.md](BUILD.md) for more information):
62-
* python PreBuild.py
57+
* python pre_build.py
6358

6459
## Source Code Directory Layout
6560
* [build](build) -- contains build scripts and cmake build modules
6661
* [docs](docs) -- contains sphinx documentation sources and a Doxygen configuration file
67-
* [include](include) -- contains GPUPerfAPI public headers
62+
* [include/gpu_performance_api](include/gpu_performance_api) -- contains GPUPerfAPI public headers
6863
* [scripts](scripts) -- scripts to use to clone/update dependent repositories
6964
* [source/auto_generated](source/auto_generated) -- contains auto-generated source code used when building GPUPerfAPI
7065
* [source/examples](source/examples) -- contains the source code for a DirectX 12, DirectX 11, Vulkan and OpenGL sample which use GPUPerfAPI
@@ -89,31 +84,39 @@ The documentation is hosted publicly at: http://gpuperfapi.readthedocs.io/en/lat
8984

9085
## Raw Hardware Counters
9186
This release exposes both "Derived" counters and "Raw Hardware" counters. Derived counters are counters that are computed using a set of raw hardware counters.
92-
While querying raw hardware counters was possible in earlier GPUPerfAPI releases, the current release makes it much simpler. In previous releases, you had to build
93-
GPUPerfAPI with special build flags in order to produce an "Internal" version that exposed the raw hardware counters. Current versions allow you to access the raw
94-
hardware counters in a default build, by simply specifying a new flag when calling GPA_OpenContext. The current CMake build system still allows you to produce an "Internal"
95-
build of GPUPerfAPI that also exposes the raw hardware counters, but that is a deprecated build and it is likely to be removed in a future release.
87+
This version allows you to access the raw hardware counters by simply specifying a flag when calling GpaOpenContext.
9688

9789
## Known Issues
98-
* On Ubuntu 20.04 LTS, Vulkan ICD may not be set to use AMD Vulkan ICD. In this case, it needs to be explicitly set to use AMD Vulkan ICD before using the GPA.
99-
It can be done by setting the "VK_ICD_FILENAMES" environment variable to "/etc/vulkan/icd.d/amd_icd64.json"
100-
* VSVerticesIn, HSPatches, and DSVerticesIn counters aren't availbale on Radeon RX 6000 Series GPU using OpenGL.
101-
* FetchSize counter will show an error when enabled on Radeon RX 6000 Series GPU using OpenGL. This is expected to be fixed in a future driver release.
102-
* Adjusting the GPU clock mode on Linux is accomplished by writing to <br><br>/sys/class/drm/card\<N\>/device/power_dpm_force_performance_level<br><br> where \<N\> is
103-
the index of the card in question. By default this file is only modifiable by root, so the application being profiled would have to be run as root in order for it to
104-
modify the clock mode. It is possible to modify the permissions for the file instead so that it can be written by unprivileged users. The following command will
105-
achieve this. Note, however, that changing the permissions on a system file like this could circumvent security. Also, on multi-GPU systems, you may have to replace
106-
"card0" with the appropriate card number. Permissions on this file may be reset when rebooting the system:
107-
* sudo chmod ugo+w /sys/class/drm/card0/device/power_dpm_force_performance_level
108-
* The following performance counter values may not be accurate for DirectX 11 applications running on a Radeon 5700, and 6000 Series GPU. This is expected to be fixed in a future driver release.
109-
* VALUInstCount, SALUInstCount, VALUBusy, SALUBusy for all shader stages: These values should be representative of performance, but may not be 100% accurate.
110-
* Most of the ComputeShader counters (all except the MemUnit and WriteUnit counters): These values should be representative of performance, but may not be 100% accurate.
111-
* The following performance counter values may not be accurate for OpenGL applications running on a Radeon 5700 Series GPU. This is expected to be addressed in a future
112-
driver release:
113-
* VALUInstCount, SALUInstCount, VALUBusy, SALUBusy for all shader stages: These values should be representative of performance, but may not be 100% accurate.
114-
* Most of the ComputeShader counters (all except the MemUnit and WriteUnit counters): These values should be representative of performance, but may not be 100% accurate.
115-
* On Linux, setting the GPU clock mode is not working correctly for Radeon 5700 Series GPUs, potentially leading to some inconsistencies in counter values from one run to the
116-
next. This is expected to be addressed in a future driver release.
90+
### Ubuntu 20.04 LTS Vulkan ICD Issue
91+
On Ubuntu 20.04 LTS, the Vulkan ICD may not be set to the AMD Vulkan ICD. In this case, it needs to be set explicitly before using GPA. This can be done by setting the ```VK_ICD_FILENAMES``` environment variable to ```/etc/vulkan/icd.d/amd_icd64.json```.
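One way to apply this setting to a single profiled process, rather than system-wide, is to pass a modified environment when launching it. This is a minimal sketch: the helper `env_with_amd_icd` is hypothetical, and the JSON path is the one quoted above, which may differ on other installations.

```python
import os

# Path quoted in the README; adjust if the AMD ICD manifest lives elsewhere.
AMD_ICD_JSON = "/etc/vulkan/icd.d/amd_icd64.json"


def env_with_amd_icd(base_env=None):
    """Return a copy of the environment with VK_ICD_FILENAMES pointing at the AMD ICD."""
    env = dict(os.environ if base_env is None else base_env)
    env["VK_ICD_FILENAMES"] = AMD_ICD_JSON
    return env
```

The returned dictionary can be handed to `subprocess.run([...], env=env_with_amd_icd())` when starting the application that loads GPA.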
92+
93+
### OpenGL FetchSize Counter on Radeon RX 6000
94+
The FetchSize counter will show an error when enabled on Radeon RX 6000 Series GPUs using OpenGL.
95+
96+
### Adjusting Linux Clock Mode
97+
Adjusting the GPU clock mode on Linux is accomplished by writing to ```/sys/class/drm/card<N>/device/power_dpm_force_performance_level```, where \<N\> is the index of the card in question.
98+
99+
By default this file is only modifiable by root, so the application being profiled would have to be run as root in order for it to modify the clock mode. It is possible to modify the permissions for the file instead so that it can be written by unprivileged users. The following command will achieve this: ```sudo chmod ugo+w /sys/class/drm/card0/device/power_dpm_force_performance_level```
100+
* Note that changing the permissions on a system file like this could circumvent security.
101+
* On multi-GPU systems you may have to replace "card0" with the appropriate card number.
102+
* You may have to reboot the system for the change to take effect.
103+
* Setting the GPU clock mode is not working correctly for <b>Radeon 5700 Series GPUs</b>, potentially leading to some inconsistencies in counter values from one run to the next.
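Whether the current user can modify the clock mode can be checked up front. The sketch below only builds the sysfs path described above and tests write permission. The helper names are hypothetical, and nothing is written to the file.

```python
import os


def clock_mode_path(card_index):
    """Return the sysfs file that controls the clock mode for card<N>."""
    return f"/sys/class/drm/card{card_index}/device/power_dpm_force_performance_level"


def can_set_clock_mode(card_index=0):
    """True if the current user may write the clock-mode file for this card."""
    return os.access(clock_mode_path(card_index), os.W_OK)


if __name__ == "__main__":
    if not can_set_clock_mode(0):
        print("Clock mode file is not writable; run as root or adjust permissions.")
```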
104+
105+
### DirectX11 Performance Counter Accuracy For Select Counters and GPUs
106+
The following performance counter values may not be accurate for DirectX 11 applications running on Radeon 5700 and 6000 Series GPUs.
107+
* VALUInstCount, SALUInstCount, VALUBusy, SALUBusy for all shader stages: These values should be representative of performance, but may not be 100% accurate.
108+
* Most of the ComputeShader counters (all except the MemUnit and WriteUnit counters): These values should be representative of performance, but may not be 100% accurate.
109+
110+
### OpenGL Performance Counter Accuracy For Radeon 5700
111+
The following performance counter values may not be accurate for OpenGL applications running on a Radeon 5700 Series GPU.
112+
* Most of the ComputeShader counters (all except the MemUnit and WriteUnit counters): These values should be representative of performance, but may not be 100% accurate.
113+
114+
### Variability in Deterministic Counters For Select GPUs
115+
Performance counters which should be deterministic are showing variability on Radeon 5700 and 6000 Series GPUs. The values should be useful for performance analysis, but may not be 100% correct.
116+
* e.g. VSVerticesIn, PrimitivesIn, PSPixelsOut, PreZSamplesPassing
117+
118+
### Profiling Bundles
119+
Profiling bundles in DirectX12 and Vulkan is not working properly. It is recommended to remove those GPA Samples from your application, or move the calls out of the bundle for profiling.
117120

118121
## Style and Format Change
119122
