Skip to content

tests: add sleep to catch reload timeout in clock_handler#1256

Open
ibujna wants to merge 1 commit intomemcached:nextfrom
ibujna:bugfix/fix_test_in_proxy_reload_timeout
Open

tests: add sleep to catch reload timeout in clock_handler#1256
ibujna wants to merge 1 commit intomemcached:nextfrom
ibujna:bugfix/fix_test_in_proxy_reload_timeout

Conversation

@ibujna
Copy link
Contributor

@ibujna ibujna commented Jan 16, 2026

The reload timeout in clock_handler() is set to 1 second. Add a sleep to reliably trigger the event timeout during tests.

The reload timeout in clock_handler() is set to 1 second.
Add a sleep to reliably trigger the event timeout during tests.
@ibujna ibujna changed the base branch from master to next January 16, 2026 09:24
@dormando
Copy link
Member

I'm not clear on this one either. The watcher should be waiting for the log lines to come in. What is this fixing, exactly?

@ibujna
Copy link
Contributor Author

ibujna commented Jan 27, 2026

The libevent call for the proxy reload functionality is configured in the clock_handler()

struct timeval t = {.tv_sec = 1, .tv_usec = 0};
function with a 1-second interval.
On some systems (tested for example on macOS Tahoe), there is a situation where the reload waits for this 1-second libevent event, but in the meantime the test receives the log entry
"ts=12212211 gid=1000 type=proxy_conf status=start" from the socket.
At minimum, a 1-second sleep should be added here. Same as setup in clock_handler().
This happens randomly about for example once every ten runs.

@dormando
Copy link
Member

dormando commented Jan 27, 2026

@ibujna what's the actual test output when it fails?

I'm fairly sure I designed this test so that it just blocks until the 1 second timer clicks over, so I'm curious what log line you're getting that causes it to fail.

IIRC there's at least one other test where it'll wait for a while looking for a specific log line (edit: while ignoring anything else), and this test probably just needs that code.

@dormando
Copy link
Member

dormando commented Mar 4, 2026

@ibujna Poking for an update; would like to see the actual output of the test error so I can understand this better.

I will also be fixing my macos CI and running it myself though...

@ibujna
Copy link
Contributor Author

ibujna commented Mar 5, 2026

So finally reproduce it on M1 Mac OS Tahoe 26.3.
In one terminal start this test perl t/proxyunits.t and on second terminal run this test in same time perl t/proxyustats.t

I get this fail output

ok 3 - succeeded to allocate 1024 ustats
! Subtest: failed to allocate 1025 ustats
ERROR: Failed to execute mcp_config_pools: stat index must be 1024 or less
ts=1772750001.21195 gid=4 type=proxy_conf status=start
not ok 1 - reload not started
! Failed test 'reload not started'
! at t/proxyustats.t line 116.
! 'ts=1772750001.21195 gid=4 type=proxy_conf status=start
! '
! matches '(?^:ts=(\S+) gid=\d+ type=proxy_conf status=start)'
ok 2 - reload not completed
ok 3 - watcher enabled
1..3
! Looks like you failed 1 test of 3.
not ok 4 - failed to allocate 1025 ustats
! Failed test 'failed to allocate 1025 ustats'
! at t/proxyustats.t line 120.
! Subtest: failed to allocate ustat at index 0
ERROR: Failed to execute mcp_config_pools: stat index must be 1 or higher
Use of uninitialized value $s in string at t/proxyustats.t line 129, line 1.
Warning: something's wrong at t/proxyustats.t line 129, line 1.
ok 1 - reload not started
ok 2 - reload not completed
ok 3 - watcher enabled
1..3
ok 5 - failed to allocate ustat at index 0
1..5

This is my patch to see string from proxy
proxyustats_debug.patch

@dormando
Copy link
Member

dormando commented Mar 5, 2026

Can you use prove t/proxyunits.t instead of raw perl? It wraps the TAP output to be legible.

So it requires running both tests at the same time, but it's just the ustats one that fails in a weird way?

@ibujna
Copy link
Contributor Author

ibujna commented Mar 6, 2026

I can't now reproduce it now again. This issue occurs only occasionally.

This is text output of perl test error
test.txt

When I debugged and when this situation occurs, libevent is triggered after 1s setup here so there is 1s delay in reloading.

@dormando
Copy link
Member

dormando commented Mar 6, 2026

The output doesn't make sense to me. I know that it can't fail unless it's getting a different log line, but your test output shows the correct log line. There's no way there's a timing delay issue here unless it results in the wrong log line being printed.

I'll come back to this later. should just write a set of MemcachedTest.pm subs for dealing with watcher logs. In other places I have it in a loop looking for the correct log line for some amount of time, which would probably do it here.

@ibujna
Copy link
Contributor Author

ibujna commented Mar 11, 2026

You testing unlike of this ts=1772750001.21195 gid=4 type=proxy_conf status=start

unlike(<$watcher>, qr/ts=(\S+) gid=\d+ type=proxy_conf status=start/, "reload not started");

So proxy still not reloading and waiting to this code occured
struct timeval t = {.tv_sec = 1, .tv_usec = 0};

And still return proper status. There is 1s timeout.

@dormando
Copy link
Member

I don't think there's a timeout on reading the $watcher filehandle. It's supposed to block forever?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants