RGA 2.11, Windows 11
kernel.cl:
__kernel __attribute__((reqd_work_group_size(8, 8, 1))) void foo(__global uchar *buf) {
    uint a = get_global_id(0);
    uint b = get_global_id(1);
    buf[a] = a - b;
}
Compiled as:
rga -s opencl -c gfx1030 --O3 --livereg-sgpr . kernel.cl
Analysis result:
Line | Rn |          | Instruction
--------------------------------------------------------------------------------
   1 |  3 |     :::  | s_clause 0x1
   2 |  5 | ^^^^vv:  | s_load_dwordx4 s[0:3], s[4:5], 0x30
   3 |  5 | : : xx:  | s_load_dwordx2 s[4:5], s[4:5], null
   4 |  5 | : : ::v  | v_lshl_or_b32 v0, s6, 3, v0
   5 |  5 | : : :: ^ | s_mulk_i32 s7, 0xf8
   6 |  5 | : : :: : | s_waitcnt lgkmcnt(0)
   7 |  5 | : v :: : | v_add_nc_u32 v1, s2, v1
   8 |  4 | v   :: : | v_add_nc_u32 v0, s0, v0
   9 |  3 |     :: v | v_sub_nc_u32 v1, s7, v1
  10 |  2 |     ::   | v_add_nc_u32 v2, v1, v0
  11 |  3 | ^   v:   | v_add_co_u32 v0, s0, s4, v0
  12 |  2 | v    v   | v_add_co_ci_u32 v1, null, s5, 0, s0
  13 |  0 |          | global_store_byte v[0:1], v2, off
  14 |  0 |          | s_endpgm
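At the start of the kernel, register s7 contains get_group_id(1). The 5th instruction both reads and writes it (the semantics of s_mulk_i32 are D = D * simm16), so s7 is already live at kernel entry and the 5th instruction should be marked x (read and write) rather than ^ (write only). The correct analysis should therefore start with: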
Line | Rn |          | Instruction
--------------------------------------------------------------------------------
   1 |  3 |     :::: | s_clause 0x1
   2 |  5 | ^^^^vv:: | s_load_dwordx4 s[0:3], s[4:5], 0x30
   3 |  5 | : : xx:: | s_load_dwordx2 s[4:5], s[4:5], null
   4 |  5 | : : ::v: | v_lshl_or_b32 v0, s6, 3, v0
   5 |  5 | : : :: x | s_mulk_i32 s7, 0xf8
...
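
To make the expected marking easy to check, here is a minimal Python sketch of a backward liveness pass over the SGPRs in this listing. It is only an illustration under stated assumptions: the read/write sets are transcribed by hand from the disassembly above, `INSTRS` and `mark_liveness` are hypothetical names, and this is not how RGA itself is implemented. The marker meanings (`^` write, `v` read, `x` read and write, `:` live through) are inferred from the output above.

```python
# Hand-transcribed SGPR usage per instruction: (text, registers read, registers written).
# Assumption: only s0..s7 matter here; VGPRs are ignored.
INSTRS = [
    ("s_clause 0x1", set(), set()),
    ("s_load_dwordx4 s[0:3], s[4:5], 0x30", {4, 5}, {0, 1, 2, 3}),
    ("s_load_dwordx2 s[4:5], s[4:5], null", {4, 5}, {4, 5}),
    ("v_lshl_or_b32 v0, s6, 3, v0", {6}, set()),
    ("s_mulk_i32 s7, 0xf8", {7}, {7}),  # D = D * simm16: s7 is read AND written
    ("s_waitcnt lgkmcnt(0)", set(), set()),
    ("v_add_nc_u32 v1, s2, v1", {2}, set()),
    ("v_add_nc_u32 v0, s0, v0", {0}, set()),
    ("v_sub_nc_u32 v1, s7, v1", {7}, set()),
    ("v_add_nc_u32 v2, v1, v0", set(), set()),
    ("v_add_co_u32 v0, s0, s4, v0", {4}, {0}),
    ("v_add_co_ci_u32 v1, null, s5, 0, s0", {0, 5}, set()),
    ("global_store_byte v[0:1], v2, off", set(), set()),
    ("s_endpgm", set(), set()),
]


def mark_liveness(instrs, num_regs=8):
    """Return one marker string per instruction, one character per SGPR."""
    live_after = set()  # registers still needed after the current instruction
    rows = []
    for _text, reads, writes in reversed(instrs):
        # Standard dataflow step: live-in = (live-out - defs) | uses
        live_before = (live_after - writes) | reads
        row = ""
        for reg in range(num_regs):
            if reg in reads and reg in writes:
                row += "x"
            elif reg in writes:
                row += "^"
            elif reg in reads:
                row += "v"
            elif reg in live_before and reg in live_after:
                row += ":"  # live across this instruction, not touched by it
            else:
                row += " "
        rows.append(row)
        live_after = live_before
    rows.reverse()
    return rows


if __name__ == "__main__":
    for (text, _r, _w), row in zip(INSTRS, mark_liveness(INSTRS)):
        print(f"{row} | {text}")
```

With s_mulk_i32 modelled as read and write, the first five rows come out as in the expected listing above. Changing its read set to `set()` (write only) reproduces the marking RGA currently reports, where s7 is not live before the 5th instruction.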