Description
Environment
prove
prove --version
TAP::Harness v3.43 and Perl v5.34.0
nginx
nginx -V
nginx version: openresty/1.25.3.1
built by gcc 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
built with OpenSSL 3.2.0 23 Nov 2023
TLS SNI support enabled
configure arguments: --prefix=/usr/local/openresty/nginx --with-debug --with-cc-opt='-DNGX_LUA_USE_ASSERT -DNGX_LUA_ABORT_AT_PANIC -O2 -DAPISIX_RUNTIME_VER=1.2.0 -DNGX_GRPC_CLI_ENGINE_PATH=/usr/local/openresty/libgrpc_engine.so -DNGX_HTTP_GRPC_CLI_ENGINE_PATH=/usr/local/openresty/libgrpc_engine.so -DNGX_LUA_ABORT_AT_PANIC -I/usr/local/openresty/zlib/include -I/usr/local/openresty/pcre/include -I/usr/local/openresty/openssl3/include' --add-module=../ngx_devel_kit-0.3.3 --add-module=../echo-nginx-module-0.63 --add-module=../xss-nginx-module-0.06 --add-module=../ngx_coolkit-0.2 --add-module=../set-misc-nginx-module-0.33 --add-module=../form-input-nginx-module-0.12 --add-module=../encrypted-session-nginx-module-0.09 --add-module=../srcache-nginx-module-0.33 --add-module=../ngx_lua-0.10.26 --add-module=../ngx_lua_upstream-0.07 --add-module=../headers-more-nginx-module-0.37 --add-module=../array-var-nginx-module-0.06 --add-module=../memc-nginx-module-0.20 --add-module=../redis2-nginx-module-0.15 --add-module=../redis-nginx-module-0.3.9 --add-module=../ngx_stream_lua-0.0.14 --with-ld-opt='-Wl,-rpath,/usr/local/openresty/luajit/lib -Wl,-rpath,/usr/local/openresty/wasmtime-c-api/lib -L/usr/local/openresty/zlib/lib -L/usr/local/openresty/pcre/lib -L/usr/local/openresty/openssl3/lib -Wl,-rpath,/usr/local/openresty/zlib/lib:/usr/local/openresty/pcre/lib:/usr/local/openresty/openssl3/lib' --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../mod_dubbo-1.0.2 --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../ngx_multi_upstream_module-1.2.0 --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../apisix-nginx-module-1.16.0 --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../apisix-nginx-module-1.16.0/src/stream --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../apisix-nginx-module-1.16.0/src/meta --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../wasm-nginx-module-0.7.0 --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../lua-var-nginx-module-v0.5.3 
--add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../grpc-client-nginx-module-v0.5.0 --add-module=/tmp/tmp.8fm9BEJ9Sy/openresty-1.25.3.1/../lua-resty-events-0.2.0 --with-poll_module --with-pcre-jit --with-stream --with-stream_ssl_module --with-stream_ssl_preread_module --with-http_v2_module --with-http_v3_module --without-mail_pop3_module --without-mail_imap_module --without-mail_smtp_module --with-http_stub_status_module --with-http_realip_module --with-http_addition_module --with-http_auth_request_module --with-http_secure_link_module --with-http_random_index_module --with-http_gzip_static_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-threads --with-compat --with-stream --without-pcre2 --with-http_ssl_module
apisix
apisix version
3.9.1
test-nginx: master branch
How to reproduce
Run the unit tests with prove. Many tests report errors such as `timeout when waiting for the process 78711 to exit`.
prove -v -I ./test-nginx/lib -I./ t/plugin/openid-connect.t
ok 1 - t/plugin/openid-connect.t TEST 1: Sanity check with minimal valid configuration. - status code ok
ok 2 - t/plugin/openid-connect.t TEST 1: Sanity check with minimal valid configuration. - response_body - response is expected (repeated req 0, req 0)
ok 3 - t/plugin/openid-connect.t TEST 1: Sanity check with minimal valid configuration. - pattern "[error]" does not match a line in error.log (req 0)
t/plugin/openid-connect.t TEST 2: Missing `client_id`. - timeout when waiting for the process 78711 to exit at /workspace/test-nginx/lib/Test/Nginx/Util.pm line 681.
t/plugin/openid-connect.t TEST 2: Missing `client_id`. - WARNING: killing the child process 78711 with force... at /workspace/test-nginx/lib/Test/Nginx/Util.pm line 720.
ok 4 - t/plugin/openid-connect.t TEST 2: Missing `client_id`. - status code ok
ok 5 - t/plugin/openid-connect.t TEST 2: Missing `client_id`. - response_body - response is expected (repeated req 0, req 0)
ok 6 - t/plugin/openid-connect.t TEST 2: Missing `client_id`. - pattern "[error]" does not match a line in error.log (req 0)
t/plugin/openid-connect.t TEST 3: Wrong type for `client_id`. - timeout when waiting for the process 78899 to exit at /workspace/test-nginx/lib/Test/Nginx/Util.pm line 681.
t/plugin/openid-connect.t TEST 3: Wrong type for `client_id`. - WARNING: killing the child process 78899 with force... at /workspace/test-nginx/lib/Test/Nginx/Util.pm line 720.
Many defunct nginx processes are also left behind:
ps -ef | grep nginx
root 785 1 0 08:40 ? 00:00:00 [nginx] <defunct>
root 2248 1 0 08:41 ? 00:00:00 [nginx] <defunct>
root 2885 1 0 08:42 ? 00:00:00 [nginx] <defunct>
root 4446 1 0 08:43 ? 00:00:00 [nginx] <defunct>
root 5007 1 0 08:44 ? 00:00:00 [nginx] <defunct>
root 19585 1 0 09:00 ? 00:00:00 [nginx] <defunct>
root 19770 1 0 09:00 ? 00:00:00 [nginx] <defunct>
root 21483 1 0 09:02 ? 00:00:00 [nginx] <defunct>
root 25649 1 0 09:07 ? 00:00:00 [nginx] <defunct>
root 27841 1 0 09:09 ? 00:00:00 [nginx] <defunct>
root 27842 1 0 09:09 ? 00:00:00 [nginx] <defunct>
root 27843 1 0 09:09 ? 00:00:00 [nginx] <defunct>
root 27989 1 0 09:09 ? 00:00:00 [nginx] <defunct>
root 27990 1 0 09:09 ? 00:00:00 [nginx] <defunct>
root 27991 1 0 09:09 ? 00:00:00 [nginx] <defunct>
root 28104 1 0 09:09 ? 00:00:00 [nginx] <defunct>
root 28105 1 0 09:09 ? 00:00:00 [nginx] <defunct>
root 28106 1 0 09:09 ? 00:00:00 [nginx] <defunct>
root 28243 1 0 09:09 ? 00:00:00 [nginx] <defunct>
root 28244 1 0 09:09 ? 00:00:00 [nginx] <defunct>
root 28245 1 0 09:09 ? 00:00:00 [nginx] <defunct>
root 29939 1 0 09:11 ? 00:00:00 [nginx] <defunct>
The possible reason
After a unit test completes, prove signals the nginx process to stop. However, nginx may exit so quickly that prove has not yet waited for (reaped) the child process. The nginx process then becomes a zombie, but `is_running` still reports it as a live process because `kill 0, $pid` succeeds for a zombie. prove therefore keeps waiting until the timeout expires and then kills the process with force.
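The behaviour described above can be reproduced in isolation. The following is a minimal sketch, not code from test-nginx: it forks a direct child that exits at once and shows that `kill 0` keeps succeeding until `waitpid` reaps it. It assumes the process is a direct child of the caller; the `ps` output above lists parent PID 1 for the defunct nginx processes, so that assumption may not hold for them.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);
use Time::HiRes qw(sleep);

# Fork a child that exits immediately. Until the parent reaps it,
# the child stays in the process table as a zombie.
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    POSIX::_exit(0);
}

sleep 0.2;    # give the child time to exit

# kill 0 only checks that the PID exists and can be signalled,
# so it still succeeds for a zombie.
my $alive_before_reap = kill(0, $pid) ? 1 : 0;

# waitpid returns the PID once the exited child has been reaped.
my $reaped = waitpid($pid, WNOHANG);

# After reaping, the PID is gone and kill 0 fails.
my $alive_after_reap = kill(0, $pid) ? 1 : 0;

print "alive before reap: $alive_before_reap\n";
print "reaped by waitpid: ", ($reaped == $pid ? "yes" : "no"), "\n";
print "alive after reap:  $alive_after_reap\n";
```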
test-nginx/lib/Test/Nginx/Util.pm
Lines 654 to 680 in 7375bd9
    if (defined $pid) {
        if ($ENV{TEST_NGINX_FAST_SHUTDOWN}) {
            if ($Verbose) {
                warn "sending TERM signal to $pid";
            }
            kill(SIGTERM, $pid);
        } else {
            if ($Verbose) {
                warn "sending QUIT signal to $pid";
            }
            kill(SIGQUIT, $pid);
        }
    }
    if ($Verbose) {
        warn "waitpid timeout: ", timeout();
    }
    my $timeout_val = timeout();
    while ($timeout_val > 0 && is_running($pid)) {
        waitpid($pid, WNOHANG);
        sleep 0.05;
        $timeout_val -= 0.05;
    }
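The loop above discards the return value of `waitpid`. As a sketch of one possible variation, the hypothetical helper `wait_for_exit` below (not part of Test::Nginx::Util) stops as soon as `waitpid` reports that the child was reaped. This only helps when the process is a direct child of the caller, which is an assumption here.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);
use Time::HiRes qw(sleep);

# Hypothetical helper: returns 1 if the process is gone before the
# timeout (in seconds) runs out, 0 otherwise.
sub wait_for_exit {
    my ($pid, $timeout) = @_;
    while ($timeout > 0) {
        my $res = waitpid($pid, WNOHANG);
        return 1 if $res == $pid;         # our child exited and is now reaped
        # kill 0 failed: the process no longer exists
        # (or we are not allowed to signal it)
        return 1 unless kill(0, $pid);
        sleep 0.05;
        $timeout -= 0.05;
    }
    return 0;
}

# Usage: a child that exits immediately is detected on an early poll.
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    POSIX::_exit(0);
}
my $gone = wait_for_exit($pid, 2);
print "child gone: $gone\n";
```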
My workaround
I modified the `is_running` function to treat zombie processes as not running. The unit tests now run faster and print far fewer error messages. However, a large number of zombie processes are still left behind.
The original function
test-nginx/lib/Test/Nginx/Util.pm
Lines 192 to 195 in 7375bd9
    sub is_running ($) {
        my $pid = shift;
        return kill 0, $pid;
    }
My workaround
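    # Return true if the process is a zombie (state "Z" in the ps output).
    sub is_defunct ($) {
        my $pid = shift;
        my $output = `ps -o stat= -p $pid`;
        chomp($output);
        return $output =~ /Z/;
    }

    # A zombie still answers kill 0, so exclude it explicitly.
    sub is_running ($) {
        my $pid = shift;
        return kill(0, $pid) && !is_defunct($pid);
    }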
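The workaround starts a `ps` process on every poll. On Linux the same state can be read from `/proc/<pid>/stat` without spawning a process. The function name `is_defunct_proc` is hypothetical, and the sketch assumes a mounted `/proc`, so it is not portable to other systems.

```perl
use strict;
use warnings;
use POSIX ();
use Time::HiRes qw(sleep);

# Hypothetical alternative to is_defunct; Linux-only because it reads /proc.
sub is_defunct_proc {
    my $pid = shift;
    open(my $fh, '<', "/proc/$pid/stat") or return 0;    # no such process
    my $line = <$fh>;
    close($fh);
    return 0 unless defined $line;
    # Format: "pid (comm) state ...". comm may itself contain ')',
    # so the greedy match anchors on the last ')' in the line.
    return ($line =~ /^.*\)\s+(\S)/s && $1 eq 'Z') ? 1 : 0;
}

# Usage: a child that has exited but not been reaped is a zombie.
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    POSIX::_exit(0);
}
sleep 0.2;    # give the child time to exit
my $zombie_before = is_defunct_proc($pid);
waitpid($pid, 0);    # reap the child
my $zombie_after = is_defunct_proc($pid);
print "zombie before reap: $zombie_before, after reap: $zombie_after\n";
```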