Commit f836498

Preliminary support for NVIDIA Jetson boards
NVIDIA Jetson is an industrial, Linux-based embedded aarch64 platform with a powerful built-in GPU, used for AI tasks, mostly computer vision. Support is enabled via the --enable-nvidia-jetson switch of the configure script. All source code specific to NVIDIA Jetson lives in the linux/NvidiaJetson.{h,c} source files and is guarded by the 'NVIDIA_JETSON' C preprocessor define, so the source code for x86_64 platforms stays unchanged.

Additional functionality added by this commit:

1. Fix for the CPU temperature reading. The Jetson device is not supported by libsensors. The CPU has 8 cores but only one CPU temperature sensor for all of them, exposed through a thermal zone file. libsensors may be compiled in or turned off; care was taken so that the build succeeds both with and without libsensors.
2. The Jetson GPU meter was added: current load, frequency and temperature.

== Technical details ==

The code tries to find the correct sensors during application startup. As an example, the sensor locations for NVIDIA Jetson Orin are the following:

- CPU temperature: /sys/devices/virtual/thermal/thermal_zone0/type
- GPU temperature: /sys/devices/virtual/thermal/thermal_zone1/type
- GPU frequency: /sys/class/devfreq/17000000.gpu/cur_freq
- GPU current load: /sys/class/devfreq/17000000.gpu/device/load

Units:

- The GPU frequency is provided in Hz and shown in MHz.
- The CPU/GPU temperatures are provided in Celsius multiplied by 1000 (milli-Celsius) and shown in Celsius.

P.S. The GUI shows all temperatures for NVIDIA Jetson with more precision than on the default x86_64 platform.

If htop starts with root privileges (effective user id is 0), the experimental code is activated. It reads the fixed debugfs file /sys/kernel/debug/nvmap/iovmm/clients, whose content looks like this:

```
CLIENT PROCESS PID SIZE
user gpu_burn 7979 23525644K
user gnome_shell 8119 5800K
user Xorg 2651 17876K
total 23549320K
```

Unfortunately, the /sys/kernel/debug/* files can be read only by the root user, which is why the restriction applies.

The patch also adds a separate field 'GPU_MEM', which reads data from the added LinuxProcess::gpu_mem field. The field stores the memory allocated for the GPU in kilobytes. It is populated by the function NvidiaJetson_LoadGpuProcessTable (implemented in NvidiaJetson.c), which is called at the end of the function Machine_scanTables.

Additionally, a new Action is added: actionToggleGpuFilter, which is activated by the 'g' hot key (the help is updated accordingly). The GpuFilter shows only the processes that currently use the GPU (i.e. a heavily extended view of the nvmap/iovmm/clients table). This is achieved with the filtering machinery associated with ProcessTable::pidMatchList. The code below constructs the GPU_PID_MATCH_LIST hash table; actionToggleGpuFilter then either stores it in ProcessTable::pidMatchList or restores the old value of ProcessTable::pidMatchList.

A separate LinuxProcess PROCESS_FLAG_LINUX_GPU_JETSON (or something ...) flag isn't added for GPU_MEM, because the code that populates LinuxProcess::gpu_mem is currently shared with the construction of the GPU consumers filter. So, even if the GPU_MEM field is not activated, the filter showing GPU consumers should work. This architecture is chosen intentionally, since it saves memory for the hash table GPU_PID_MATCH_LIST (which is now actually a set) and therefore increases performance. All other approaches would turn GPU_PID_MATCH_LIST into a true key/value storage (key = pid, value = GPU memory allocated) and require further merge code.

== NVIDIA Jetson models ==

Tested on NVIDIA Jetson Orin and Xavier boards.
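The clients table quoted in the commit message has a simple row layout, so a parser for it can be sketched as follows. This is only an illustration under stated assumptions: `GpuClient` and `parseClientLine` are hypothetical names, and the actual parsing in NvidiaJetson.c is not shown on this page and may differ.

```c
#include <stdio.h>

/* One parsed row of the clients table (hypothetical type, not from the commit). */
typedef struct {
   int pid;
   unsigned long long sizeKb; /* GPU memory in kilobytes */
} GpuClient;

/*
 * Parse a row such as "user gpu_burn 7979 23525644K".
 * Returns 1 on success and 0 for the header row, the "total" row
 * or any other line that does not match the expected layout.
 */
static int parseClientLine(const char* line, GpuClient* out) {
   char kind[32];
   char name[64];
   char unit = '\0';
   int pid = 0;
   unsigned long long size = 0;

   /* Five conversions must succeed: client kind, process name, PID, size, unit */
   if (sscanf(line, "%31s %63s %d %llu%c", kind, name, &pid, &size, &unit) != 5)
      return 0;
   if (unit != 'K' || pid <= 0)
      return 0;

   out->pid = pid;
   out->sizeKb = size;
   return 1;
}
```

Rejecting every row that does not convert completely means the header and the "total" line need no special handling.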
1 parent 7720dbd commit f836498

19 files changed, +498 -9 lines changed

Action.c

Lines changed: 27 additions & 0 deletions
@@ -637,6 +637,29 @@ static Htop_Reaction actionTogglePauseUpdate(State* st) {
    return HTOP_REFRESH | HTOP_REDRAW_BAR | HTOP_KEEP_FOLLOWING;
 }

+#ifdef NVIDIA_JETSON
+#include "NvidiaJetson.h"
+#include "ProcessTable.h"
+
+static Htop_Reaction actionToggleGpuFilter(State* st) {
+   static Hashtable *stash = NULL;
+
+   Hashtable *GpuPidMatchList = NvidiaJetson_GetPidMatchList();
+   if (GpuPidMatchList) {
+      st->showGpuProcesses = !st->showGpuProcesses;
+
+      ProcessTable *pt = (ProcessTable *)st->host->activeTable;
+      if (st->showGpuProcesses) {
+         stash = pt->pidMatchList;
+         pt->pidMatchList = GpuPidMatchList;
+      } else {
+         pt->pidMatchList = stash;
+      }
+   }
+   return HTOP_REFRESH | HTOP_REDRAW_BAR | HTOP_KEEP_FOLLOWING;
+}
+#endif
+
 static const struct {
    const char* key;
    bool roInactive;
@@ -649,6 +672,7 @@ static const struct {
    { .key = " F3 /: ", .roInactive = false, .info = "incremental name search" },
    { .key = " F4 \\: ", .roInactive = false, .info = "incremental name filtering" },
    { .key = " F5 t: ", .roInactive = false, .info = "tree view" },
+   { .key = " g: ", .roInactive = false, .info = "show processes which use GPU" },
    { .key = " p: ", .roInactive = false, .info = "toggle program path" },
    { .key = " m: ", .roInactive = false, .info = "toggle merged command" },
    { .key = " Z: ", .roInactive = false, .info = "pause/resume process updates" },
@@ -924,6 +948,9 @@ void Action_setBindings(Htop_Action* keys) {
    keys['a'] = actionSetAffinity;
    keys['c'] = actionTagAllChildren;
    keys['e'] = actionShowEnvScreen;
+#ifdef NVIDIA_JETSON
+   keys['g'] = actionToggleGpuFilter;
+#endif
    keys['h'] = actionHelp;
    keys['k'] = actionKill;
    keys['l'] = actionLsof;
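The stash-and-restore logic of actionToggleGpuFilter can be looked at in isolation. The sketch below replaces htop's types with hypothetical stand-ins (`Table`, `ViewState`, an opaque list pointer), so it illustrates only the toggle pattern and is not the commit's code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for htop's ProcessTable and State. */
typedef struct {
   const void* pidMatchList; /* NULL means "no PID filter" */
} Table;

typedef struct {
   bool showGpuProcesses;
} ViewState;

/* Toggle the GPU filter; does nothing when no GPU list is available. */
static void toggleGpuFilter(ViewState* st, Table* pt, const void* gpuList) {
   static const void* stash = NULL; /* filter that was active before the toggle */

   if (!gpuList)
      return;

   st->showGpuProcesses = !st->showGpuProcesses;
   if (st->showGpuProcesses) {
      stash = pt->pidMatchList;   /* remember the previous filter */
      pt->pidMatchList = gpuList; /* show GPU consumers only */
   } else {
      pt->pidMatchList = stash;   /* put the previous filter back */
   }
}
```

Because the stash is a single static variable, the pattern assumes one table and no other code replacing the match list while the GPU filter is active.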

Action.h

Lines changed: 3 additions & 0 deletions
@@ -39,6 +39,9 @@ typedef struct State_ {
    bool pauseUpdate;
    bool hideSelection;
    bool hideMeters;
+#ifdef NVIDIA_JETSON
+   bool showGpuProcesses;
+#endif
 } State;

 static inline bool State_hideFunctionBar(const State* st) {

CPUMeter.c

Lines changed: 14 additions & 1 deletion
@@ -98,17 +98,30 @@ static void CPUMeter_updateValues(Meter* this) {
       }
    }

+   #ifdef NVIDIA_JETSON
+   if (settings->showCPUTemperature) {
+      char c = 'C';
+      double cpuTemperature = this->values[CPU_METER_TEMPERATURE];
+      if (settings->degreeFahrenheit) {
+         c = 'F';
+         cpuTemperature = ConvCelsiusToFahrenheit(cpuTemperature);
+      }
+      /* snprintf correctly represents double NAN numbers as 'nan' */
+      xSnprintf(cpuTemperatureBuffer, sizeof(cpuTemperatureBuffer), "%.1f%s%c", cpuTemperature, CRT_degreeSign, c);
+   }
+   #else
    #ifdef BUILD_WITH_CPU_TEMP
    if (settings->showCPUTemperature) {
       double cpuTemperature = this->values[CPU_METER_TEMPERATURE];
       if (isNaN(cpuTemperature))
          xSnprintf(cpuTemperatureBuffer, sizeof(cpuTemperatureBuffer), "N/A");
       else if (settings->degreeFahrenheit)
-         xSnprintf(cpuTemperatureBuffer, sizeof(cpuTemperatureBuffer), "%3d%sF", (int)(cpuTemperature * 9 / 5 + 32), CRT_degreeSign);
+         xSnprintf(cpuTemperatureBuffer, sizeof(cpuTemperatureBuffer), "%3d%sF", (int)(ConvCelsiusToFahrenheit(cpuTemperature)), CRT_degreeSign);
       else
          xSnprintf(cpuTemperatureBuffer, sizeof(cpuTemperatureBuffer), "%d%sC", (int)cpuTemperature, CRT_degreeSign);
    }
    #endif
+   #endif

    xSnprintf(this->txtBuffer, sizeof(this->txtBuffer), "%s%s%s%s%s",
       cpuUsageBuffer,
CRT.c

Lines changed: 3 additions & 0 deletions
@@ -231,6 +231,9 @@ static int CRT_colorSchemes[LAST_COLORSCHEME][LAST_COLORELEMENT] = {
       [DYNAMIC_MAGENTA] = ColorPair(Magenta, Black),
       [DYNAMIC_YELLOW] = ColorPair(Yellow, Black),
       [DYNAMIC_WHITE] = ColorPair(White, Black),
+#ifdef NVIDIA_JETSON
+      [GPU_FILTER] = A_BOLD | ColorPair(Red, Cyan),
+#endif
    },
    [COLORSCHEME_MONOCHROME] = {
       [RESET_COLOR] = A_NORMAL,

CRT.h

Lines changed: 3 additions & 0 deletions
@@ -158,6 +158,9 @@ typedef enum ColorElements_ {
    DYNAMIC_MAGENTA,
    DYNAMIC_YELLOW,
    DYNAMIC_WHITE,
+#ifdef NVIDIA_JETSON
+   GPU_FILTER,
+#endif
    LAST_COLORELEMENT
 } ColorElements;

DisplayOptionsPanel.c

Lines changed: 3 additions & 3 deletions
@@ -173,10 +173,10 @@ DisplayOptionsPanel* DisplayOptionsPanel_new(Settings* settings, ScreenManager*
    Panel_add(super, (Object*) CheckItem_newByRef("Also show CPU frequency", &(settings->showCPUFrequency)));
    #ifdef BUILD_WITH_CPU_TEMP
    Panel_add(super, (Object*) CheckItem_newByRef(
-   #if defined(HTOP_LINUX)
-      "Also show CPU temperature (requires libsensors)",
-   #elif defined(HTOP_FREEBSD)
+   #if defined(HTOP_FREEBSD) || defined(NVIDIA_JETSON)
       "Also show CPU temperature",
+   #elif defined(HTOP_LINUX)
+      "Also show CPU temperature (requires libsensors)",
    #else
    #error Unknown temperature implementation!
    #endif

Machine.c

Lines changed: 5 additions & 0 deletions
@@ -6,6 +6,7 @@ Released under the GNU GPLv2+, see the COPYING file
 in the source distribution for its full text.
 */

+#include "NvidiaJetson.h"
 #include "config.h" // IWYU pragma: keep

 #include "Machine.h"
@@ -128,4 +129,8 @@ void Machine_scanTables(Machine* this) {
    }

    Row_setUidColumnWidth(this->maxUserId);
+
+#ifdef NVIDIA_JETSON
+   NvidiaJetson_LoadGpuProcessTable(this->activeTable->table);
+#endif
 }

MainPanel.c

Lines changed: 5 additions & 0 deletions
@@ -195,6 +195,11 @@ static void MainPanel_drawFunctionBar(Panel* super, bool hideFunctionBar) {
    if (this->state->pauseUpdate) {
       FunctionBar_append("PAUSED", CRT_colors[PAUSED]);
    }
+#ifdef NVIDIA_JETSON
+   if (this->state->showGpuProcesses) {
+      FunctionBar_append("GPU", CRT_colors[GPU_FILTER]);
+   }
+#endif
 }

 static void MainPanel_printHeader(Panel* super) {

Makefile.am

Lines changed: 5 additions & 0 deletions
@@ -215,6 +215,11 @@ linux_platform_sources = \
 	zfs/ZfsArcMeter.c \
 	zfs/ZfsCompressedArcMeter.c

+if NVIDIA_JETSON
+linux_platform_headers += linux/NvidiaJetson.h
+linux_platform_sources += linux/NvidiaJetson.c
+endif
+
 if HAVE_DELAYACCT
 linux_platform_headers += linux/LibNl.h
 linux_platform_sources += linux/LibNl.c

XUtils.h

Lines changed: 12 additions & 0 deletions
@@ -119,6 +119,13 @@ char* xStrndup(const char* str, size_t len);

 ATTR_NONNULL ATTR_ACCESS3_W(2, 3)
 ssize_t xReadfile(const char* pathname, void* buffer, size_t count);
+#ifdef NVIDIA_JETSON /* uncomment if you need this functionality somewhere else */
+ATTR_NONNULL ATTR_ACCESS3_W(2, 3)
+static inline double xReadNumberFromFile(const char *pathname, char *buf, const size_t len) {
+   ssize_t nread = xReadfile(pathname, buf, len);
+   return nread > 0 ? strtod(buf, NULL) : NAN;
+}
+#endif
 ATTR_NONNULL ATTR_ACCESS3_W(3, 4)
 ssize_t xReadfileat(openat_arg_t dirfd, const char* pathname, void* buffer, size_t count);

@@ -174,4 +181,9 @@ static inline int xDirfd(DIR* dirp) {
    return r;
 }

+
+static inline double ConvCelsiusToFahrenheit(const double celsius) {
+   return celsius * 9 / 5 + 32;
+}
+
 #endif
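The commit message states that temperatures arrive in milli-Celsius and the GPU frequency in Hz. A standalone sketch of reading such a value and converting it for display might look like the code below. It uses only the C standard library instead of htop's `xReadfile`, and all three function names are hypothetical.

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Read the first number from a text file; returns NAN if it cannot be read. */
static double readNumberFromFile(const char* path) {
   char buf[64];
   FILE* fp = fopen(path, "r");
   if (!fp)
      return NAN;

   char* ok = fgets(buf, sizeof(buf), fp);
   fclose(fp);
   return ok ? strtod(buf, NULL) : NAN;
}

/* Thermal zone readings are reported in milli-Celsius. */
static double milliCelsiusToCelsius(double milli) {
   return milli / 1000.0;
}

/* devfreq reports the current frequency in Hz; the meter shows MHz. */
static double hertzToMegahertz(double hz) {
   return hz / 1000000.0;
}
```

A failed read yields NAN, and NAN passes through both conversions unchanged, so a caller can check for a missing reading once, at display time.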

configure.ac

Lines changed: 42 additions & 1 deletion
@@ -1095,7 +1095,47 @@ case "$enable_sensors" in
       AC_MSG_ERROR([bad value '$enable_sensors' for --enable-sensors])
       ;;
 esac
-if test "$enable_sensors" = yes || test "$my_htop_platform" = freebsd; then
+
+AC_ARG_ENABLE([nvidia-jetson],
+              [AS_HELP_STRING([--enable-nvidia-jetson],
+                              [enable nvidia jetson support @<:@default=check@:>@])],
+              [],
+              [enable_nvidia_jetson=check])
+case "$enable_nvidia_jetson" in
+   no)
+      ;;
+   check)
+      if test -f "/etc/nv_tegra_release"; then
+         if grep -q "BOARD" "/etc/nv_tegra_release"; then
+            enable_nvidia_jetson=yes
+         fi
+      fi
+      ;;
+   yes)
+      if test -f "/etc/nv_tegra_release"; then
+         if grep -q "BOARD" "/etc/nv_tegra_release"; then
+            enable_nvidia_jetson=yes
+         else
+            enable_nvidia_jetson=no
+         fi
+      else
+         enable_nvidia_jetson=no
+      fi
+      ;;
+   *)
+      AC_MSG_ERROR([bad value '$enable_nvidia_jetson' for --enable-nvidia-jetson])
+      ;;
+esac
+
+if test "$enable_nvidia_jetson" = yes; then
+   AC_DEFINE([NVIDIA_JETSON], [1], [Detected correct NVIDIA Jetson board])
+else
+   AC_MSG_NOTICE([This is not a NVIDIA Jetson board])
+fi
+
+AM_CONDITIONAL([NVIDIA_JETSON], [test "$enable_nvidia_jetson" = yes])
+
+if test "$enable_sensors" = yes || test "$my_htop_platform" = freebsd || test "$enable_nvidia_jetson" = yes; then
    AC_DEFINE([BUILD_WITH_CPU_TEMP], [1], [Define if CPU temperature option should be enabled.])
 fi

@@ -1233,6 +1273,7 @@ AC_MSG_RESULT([
   (Linux) delay accounting: $enable_delayacct
   (Linux) sensors: $enable_sensors
   (Linux) capabilities: $enable_capabilities
+  (Linux) nvidia-jeston $enable_nvidia_jetson
   unicode: $enable_unicode
   affinity: $enable_affinity
   unwind: $enable_unwind

linux/LinuxMachine.c

Lines changed: 11 additions & 1 deletion
@@ -32,6 +32,7 @@ in the source distribution for its full text.
 #include "UsersTable.h"
 #include "XUtils.h"

+#include "linux/NvidiaJetson.h"
 #include "linux/Platform.h" // needed for GNU/hurd to get PATH_MAX // IWYU pragma: keep

 #ifdef HAVE_SENSORS_SENSORS_H
@@ -736,15 +737,20 @@ void Machine_scan(Machine* super) {
    const Settings* settings = super->settings;
    if (settings->showCPUFrequency
 #ifdef HAVE_SENSORS_SENSORS_H
-       || settings->showCPUTemperature
+       || settings->showCPUTemperature /* TODO: looks like this line in the condition might be removed */
 #endif
    )
       LinuxMachine_scanCPUFrequency(this);

+   #ifdef NVIDIA_JETSON
+   if (settings->showCPUTemperature)
+      NvidiaJetson_getCPUTemperatures(this->cpuData, super->existingCPUs);
+   #else
    #ifdef HAVE_SENSORS_SENSORS_H
    if (settings->showCPUTemperature)
       LibSensors_getCPUTemperatures(this->cpuData, super->existingCPUs, super->activeCPUs);
    #endif
+   #endif
 }

 Machine* Machine_new(UsersTable* usersTable, uid_t userId) {
@@ -787,6 +793,10 @@ Machine* Machine_new(UsersTable* usersTable, uid_t userId) {
    // Initialize CPU count
    LinuxMachine_updateCPUcount(this);

+   #ifdef NVIDIA_JETSON
+   NvidiaJetson_FindSensors();
+   #endif
+
    #ifdef HAVE_SENSORS_SENSORS_H
    // Fetch CPU topology
    LinuxMachine_fetchCPUTopologyFromCPUinfo(this);
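Since the Jetson exposes a single temperature sensor for all cores, a function like NvidiaJetson_getCPUTemperatures presumably has to copy one reading into every per-CPU record. Its body is not part of the diff shown here, so the following is only a guess at the idea: `CpuSlot` and `fillCpuTemperatures` are hypothetical, and the layout (slot 0 for the average, slots 1..existingCPUs for the cores) is an assumption.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical, reduced version of a per-CPU record. */
typedef struct {
   double temperature; /* degrees Celsius, NAN when unknown */
} CpuSlot;

/*
 * Copy one shared sensor reading (in milli-Celsius) into every CPU slot.
 * Assumption: the array holds existingCPUs + 1 entries, with slot 0
 * reserved for the average over all CPUs.
 */
static void fillCpuTemperatures(CpuSlot* cpus, size_t existingCPUs, double milliCelsius) {
   const double celsius = milliCelsius / 1000.0; /* NAN stays NAN */

   for (size_t i = 0; i <= existingCPUs; i++)
      cpus[i].temperature = celsius;
}
```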

linux/LinuxMachine.h

Lines changed: 3 additions & 2 deletions
@@ -46,9 +46,10 @@ typedef struct CPUData_ {

    double frequency;

-   #ifdef HAVE_SENSORS_SENSORS_H
+   #ifdef BUILD_WITH_CPU_TEMP
    double temperature;
-
+   #endif
+   #ifdef HAVE_SENSORS_SENSORS_H
    int physicalID; /* different for each CPU socket */
    int coreID; /* same for hyperthreading */
    int ccdID; /* same for each AMD chiplet */

linux/LinuxProcess.c

Lines changed: 11 additions & 0 deletions
@@ -6,6 +6,7 @@ Released under the GNU GPLv2+, see the COPYING file
 in the source distribution for its full text.
 */

+#include "NvidiaJetson.h"
 #include "config.h" // IWYU pragma: keep

 #include "linux/LinuxProcess.h"
@@ -112,6 +113,9 @@ const ProcessFieldData Process_fields[LAST_PROCESSFIELD] = {
 #endif
    [GPU_TIME] = { .name = "GPU_TIME", .title = "GPU_TIME ", .description = "Total GPU time", .flags = PROCESS_FLAG_LINUX_GPU, .defaultSortDesc = true, },
    [GPU_PERCENT] = { .name = "GPU_PERCENT", .title = " GPU% ", .description = "Percentage of the GPU time the process used in the last sampling", .flags = PROCESS_FLAG_LINUX_GPU, .defaultSortDesc = true, },
+#ifdef NVIDIA_JETSON
+   [GPU_MEM]= { .name = "GPU_MEM", .title = "GPU_M ", .description = "GPU memory allocated for the process", .flags = 0, .defaultSortDesc = true, },
+#endif
 };

 Process* LinuxProcess_new(const Machine* host) {
@@ -362,6 +366,9 @@ static void LinuxProcess_rowWriteField(const Row* super, RichString* str, Proces
          xSnprintf(buffer, n, "N/A ");
       }
       break;
+#ifdef NVIDIA_JETSON
+   case GPU_MEM: Row_printKBytes(str, lp->gpu_mem, coloring); return;
+#endif
    default:
       Process_writeField(this, str, field);
       return;
@@ -466,6 +473,10 @@ static int LinuxProcess_compareByKey(const Process* v1, const Process* v2, Proce
       return SPACESHIP_NUMBER(p1->gpu_time, p2->gpu_time);
    case ISCONTAINER:
       return SPACESHIP_NUMBER(v1->isRunningInContainer, v2->isRunningInContainer);
+#ifdef NVIDIA_JETSON
+   case GPU_MEM:
+      return SPACESHIP_NUMBER(p1->gpu_mem, p2->gpu_mem);
+#endif
    default:
       return Process_compareByKey_Base(v1, v2, key);
    }
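The GPU_MEM comparison operates on 64-bit unsigned values, where "subtract and return the difference" would truncate or wrap. The sketch below shows a three-way comparison that avoids this. `THREE_WAY` is a local macro written for this example and merely assumed to behave like htop's SPACESHIP_NUMBER; `Proc` and `compareByGpuMemDesc` are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison yielding -1, 0 or 1 without any subtraction of the operands. */
#define THREE_WAY(a, b) (((a) > (b)) - ((a) < (b)))

/* Hypothetical, reduced process record. */
typedef struct {
   int pid;
   uint64_t gpu_mem; /* kilobytes */
} Proc;

/* qsort comparator: largest GPU memory first (descending order). */
static int compareByGpuMemDesc(const void* v1, const void* v2) {
   const Proc* p1 = v1;
   const Proc* p2 = v2;
   return THREE_WAY(p2->gpu_mem, p1->gpu_mem);
}
```

Passed to `qsort`, this comparator puts the largest GPU consumers at the top, which corresponds to the `.defaultSortDesc = true` setting of the GPU_MEM field.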

linux/LinuxProcess.h

Lines changed: 4 additions & 0 deletions
@@ -115,6 +115,10 @@ typedef struct LinuxProcess_ {
    /* Activity of GPU: 0 if active, otherwise time of last scan in milliseconds */
    uint64_t gpu_activityMs;

+   #ifdef NVIDIA_JETSON
+   uint64_t gpu_mem;
+   #endif
+
    /* Autogroup scheduling (CFS) information */
    long int autogroup_id;
    int autogroup_nice;
