Skip to content

Use size_t for Process offset values #1588

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jun 5, 2025

Conversation

Explorer09
Copy link
Contributor

Convert cmdlineBasenameStart, cmdlineBasenameEnd and procExeBasenameOffset values in Process to size_t type. Improve functions that search for "basenames", "comm" and "procExe" to use size_t for offsets as well.

I made these set of code changes before the length properties of RichString object can be upgraded to size_t.

Note: This PR contains many platform specific code changes. I didn't test all of them. It is possible that I broke something.

@Explorer09 Explorer09 force-pushed the process-offsets-size-type branch from 454cf1e to 87ab00e Compare January 15, 2025 21:17
Process.c Outdated
@@ -1057,7 +1057,7 @@ void Process_updateExe(Process* this, const char* exe) {
if (exe) {
this->procExe = xStrdup(exe);
const char* lastSlash = strrchr(exe, '/');
this->procExeBasenameOffset = (lastSlash && *(lastSlash + 1) != '\0' && lastSlash != exe) ? (size_t)(lastSlash - exe + 1) : 0;
this->procExeBasenameOffset = (lastSlash > exe && *(lastSlash + 1) != '\0') ? (size_t)(lastSlash - exe + 1) : 0;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note: I have no idea why this conditional needs a lastSlash != exe. I thought it should be legitimate when lastSlash == exe, that is, when the exe string begins with a slash such as "/vmlinuz".

@BenBE BenBE requested review from natoscott and BenBE January 16, 2025 06:53
@BenBE BenBE added enhancement Extension or improvement to existing feature code quality ♻️ Code quality enhancement Linux 🐧 Linux related issues FreeBSD 👹 FreeBSD related issues MacOS 🍏 MacOS / Darwin related issues BSD 🐡 Issues related to *BSD PCP PCP related issues NetBSD 🎏 NetBSD related issues OpenBSD 🐡 OpenBSD related issues DragonflyBSD 🪰 DragonflyBSD related issues labels Jan 16, 2025
@Explorer09 Explorer09 marked this pull request as draft January 16, 2025 11:41
@Explorer09
Copy link
Contributor Author

Explorer09 commented Jan 16, 2025

Well, it looks like the changes to size_t do create an undefined behavior.

The commStart and commEnd variables in Process_makeCommandStr(), they can become negative during the stripExeFromCmdline code path.

I think the logic there needs to be revised. While I could workaround by using ssize_t for commStart and commEnd, that doesn't look like a long term solution.

I would try to make a commit that properly fixes that.

EDIT: This change has been done.

@Explorer09 Explorer09 force-pushed the process-offsets-size-type branch 2 times, most recently from c0c9a0b to 707cde1 Compare January 16, 2025 22:44
@BenBE
Copy link
Member

BenBE commented Jan 16, 2025

FYI; the last time I needed to work on this logic, it took me the better of 3 months to track down some issue with the command line handling in this code. So be careful with your changes; the details are important with this part of the code …

@Explorer09 Explorer09 marked this pull request as ready for review January 16, 2025 23:39
@Explorer09
Copy link
Contributor Author

@BenBE
The changes of upgrading the length values of RichString to size_t is ready here:
https://github.com/Explorer09/htop-1/commits/rich-string-chlen/

But I'm not sure I should file a new PR for it or just merge with this one. The new changes depend on this PR and I've seen you added tags and categories to this PR already.

@BenBE
Copy link
Member

BenBE commented Jan 19, 2025

@Explorer09 Make a second PR as this makes discussing the changes easier. From a quick view at your commits on that branch I already noticed some things I'd like to change with that patchset …

@Explorer09 Explorer09 force-pushed the process-offsets-size-type branch from 707cde1 to 473c847 Compare January 23, 2025 22:04
@Explorer09 Explorer09 marked this pull request as draft February 2, 2025 07:45
@Explorer09 Explorer09 force-pushed the process-offsets-size-type branch 2 times, most recently from 8b874d5 to c256d1e Compare February 3, 2025 09:12
@Explorer09 Explorer09 marked this pull request as ready for review February 3, 2025 09:45
Copy link
Member

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take me some time to check consistency with all those edits. Especially with all the changes regarding edge cases re signed-nes changing due to this PR.

What might be worth doing is having the markers for "not used" become a special constant #define STRPOS_UNUSED ((size_t)-1).

@Explorer09 Explorer09 force-pushed the process-offsets-size-type branch from c256d1e to cc24137 Compare February 9, 2025 18:16
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (1)

Process.c:136

  • Potential underflow in matchCmdlinePrefixWithExeSuffix: if cmdlineBaseOffset is 0, subtracting 1 may underflow. Consider adding a guard check to ensure cmdlineBaseOffset is non-zero before subtracting.
for (i = cmdlineBaseOffset - 1, j = exeBaseOffset - 1; i != (size_t)-1 && j != (size_t)-1 && cmdline[i] == exe[j]; --i, --j)

@BenBE
Copy link
Member

BenBE commented Apr 16, 2025

A quick glance at the indicated code location confirms this underflow to be reached. The handling seems to be fine, but a cleaner solution would be to properly catch this before reaching that for loop.

@Explorer09
Copy link
Contributor Author

A quick glance at the indicated code location confirms this underflow to be reached. The handling seems to be fine, but a cleaner solution would be to properly catch this before reaching that for loop.

It's already done, but in the last commit only ("Process: Use size_t for "cmdline" and "procExe" offsets"). Should I move that change to an earlier commit to silence the AI complaint?

@BenBE
Copy link
Member

BenBE commented Apr 16, 2025

The AI complaint is about the cmdlineBaseOffset - 1 and exeBaseOffset - 1, but the first one in particular. When you enter with cmdlineBaseOffset set to 0, this underflows and is what the AI remarked about. Yes, that's defined in the standard and the code handles this situation, but a somewhat cleaner code path here would be nice.

Ideally, your last commit would only have to update the type, with all preceding ones ensuring no negative numbers appear along the modified code paths.

@Explorer09
Copy link
Contributor Author

The AI complaint is about the cmdlineBaseOffset - 1 and exeBaseOffset - 1, but the first one in particular. When you enter with cmdlineBaseOffset set to 0, this underflows and is what the AI remarked about. Yes, that's defined in the standard and the code handles this situation, but a somewhat cleaner code path here would be nice.

Ideally, your last commit would only have to update the type, with all preceding ones ensuring no negative numbers appear along the modified code paths.

I replied here. In short, maybe it helps by incrementing the i and j counters by 1.

            for (size_t i = cmdlineBaseOffset, j = exeBaseOffset;
               i != 0 && j != 0 && cmdline[i - 1] == exe[j - 1];
               --i, --j)
               /* empty loop body */;
            if (i == 0 && j != 0 && exe[j - 1] == '/') {
               *cmdlineBasenameStart = cmdlineBaseOffset;
               return matchLen;
            }

@Explorer09 Explorer09 force-pushed the process-offsets-size-type branch 3 times, most recently from 7fdc2bd to 613c74b Compare April 22, 2025 09:36
@Explorer09 Explorer09 force-pushed the process-offsets-size-type branch 2 times, most recently from d5e6cdf to 805fc15 Compare April 28, 2025 12:37
@Explorer09 Explorer09 force-pushed the process-offsets-size-type branch 4 times, most recently from fb81c99 to db89e3e Compare May 15, 2025 21:13
@Explorer09 Explorer09 force-pushed the process-offsets-size-type branch 2 times, most recently from e625ca8 to c0bf160 Compare May 21, 2025 21:17
@Explorer09 Explorer09 force-pushed the process-offsets-size-type branch 3 times, most recently from 386c72a to 3303a96 Compare June 4, 2025 10:11
Reorder a conditional to prevent an unnecessary negative array offset
access. The compiled behavior should be exactly the same.

Signed-off-by: Kang-Che Sung <[email protected]>
No code changes. Fix an error by commit
94a52cb that accidentally replaced
a keyword in comment to "cmdlineBasenameEnd". According to the context
it should refer to "cmdlineBasenameStart" instead.
* Improve readability of local variables.
* Increment the loop counter values ("i" and "j") by 1 and adjust the
  loop termination conditions. This allow "i" and "j" to easily switch
  to an unsigned type in the future.

Signed-off-by: Kang-Che Sung <[email protected]>
Store the "cmdlineBasenameLen" and "commLen" local variables rather
than "cmdlineBasenameEnd" and "commEnd".

Signed-off-by: Kang-Che Sung <[email protected]>
Adjust logic in Process_makeCommandStr() so that if the process
"cmdline" basename has a length of zero, skip creating all of the
highlights of that basename.

Signed-off-by: Kang-Che Sung <[email protected]>
The matchCmdlinePrefixWithExeSuffix() internal function (in Process.c)
can match basenames in multiple tries in case the
"cmdlineBasenameStart" input value is unreliable. Take advantage of
this and update "cmdlineBasenameStart" with the new offset detected by
the function.

Also make the matching behavior consistent regardless of
"showMergedCommand" setting.

Signed-off-by: Kang-Che Sung <[email protected]>
Add a special case when the process "comm" string is found in
"cmdline", but at the starting basename position that would be stripped
by "stripExeFromCmdline" setting. This special case will now highlight
the basename of "exe" string as well as the first token of "cmdline"
(as the two strings are concatenated together when displaying).

Signed-off-by: Kang-Che Sung <[email protected]>
Convert the members "cmdlineBasenameStart", "cmdlineBasenameEnd" and
"procExeBasenameOffset" in "Process" to size_t type. Also upgrade many
routines that search for "basenames", COMM, and PROC_EXE string to use
size_t for iterators.

The "cmdlineBasenameEnd" value is no longer allowed to negative. It is
now set to 0 during Process_init().

Signed-off-by: Kang-Che Sung <[email protected]>
@Explorer09 Explorer09 force-pushed the process-offsets-size-type branch from 3303a96 to 656a2d2 Compare June 4, 2025 17:20
@BenBE BenBE merged commit e7a0182 into htop-dev:main Jun 5, 2025
19 checks passed
@Explorer09 Explorer09 deleted the process-offsets-size-type branch June 5, 2025 09:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BSD 🐡 Issues related to *BSD code quality ♻️ Code quality enhancement DragonflyBSD 🪰 DragonflyBSD related issues enhancement Extension or improvement to existing feature FreeBSD 👹 FreeBSD related issues Linux 🐧 Linux related issues MacOS 🍏 MacOS / Darwin related issues NetBSD 🎏 NetBSD related issues OpenBSD 🐡 OpenBSD related issues PCP PCP related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants