
Extremely Slow Startup #9657

@rnemeth90


Version

5.3.1

What Kubernetes platforms are you running on?

AKS Azure

Steps to reproduce

  1. Deploy nginx to AKS
  2. Deploy several thousand ingress objects (8000+)
  3. Scale out nginx or perform a rollout
  4. Watch the pods take 20+ minutes to pass the readiness probe

We have around 150 namespaces with ingresses configured to use this particular ingress controller, though that number may vary from day to day.

Each of these namespaces has 5 master ingresses (one per domain name) and 9 or 10 deployments, each of which creates a minion for every master (so 5 minion ingresses per service). In addition, we have a 'headless' service with an endpoint mapped to the IP address of a VM in Azure (these are the '-proxy-' minion ingress objects you see below).

ingress-master-apiautortn67
ingress-master-excelsiorautortn67
ingress-master-rtn67-financial1
ingress-master-rtn67-financial2
ingress-master-us1-rtn67

λ kubernetes-workspace (wip/586041) $ k get ing | grep -i minion| awk '{print $1}'
aprimo-csp-rtn67-minion--apiautortn67-labs-aprimo-com
aprimo-csp-rtn67-minion--excelsiorautortn67-labs-aprimo-com
aprimo-csp-rtn67-minion--rtn67-financial1-labs-aprimo-com
aprimo-csp-rtn67-minion--rtn67-financial2-labs-aprimo-com
aprimo-csp-rtn67-minion--us1-rtn67-labs-aprimo-com
aprimo-upload-rtn67-minion--apiautortn67-labs-aprimo-com
aprimo-upload-rtn67-minion--excelsiorautortn67-labs-aprimo-com
aprimo-upload-rtn67-minion--rtn67-financial1-labs-aprimo-com
aprimo-upload-rtn67-minion--rtn67-financial2-labs-aprimo-com
aprimo-upload-rtn67-minion--us1-rtn67-labs-aprimo-com
budget-ui-rtn67-minion--apiautortn67-labs-aprimo-com
budget-ui-rtn67-minion--excelsiorautortn67-labs-aprimo-com
budget-ui-rtn67-minion--rtn67-financial1-labs-aprimo-com
budget-ui-rtn67-minion--rtn67-financial2-labs-aprimo-com
budget-ui-rtn67-minion--us1-rtn67-labs-aprimo-com
cube-stack-pm-rtn67-minion-us1-rtn67-labs-aprimo-com
idealab-api-rtn67-minion--apiautortn67-labs-aprimo-com
idealab-api-rtn67-minion--excelsiorautortn67-labs-aprimo-com
idealab-api-rtn67-minion--rtn67-financial1-labs-aprimo-com
idealab-api-rtn67-minion--rtn67-financial2-labs-aprimo-com
idealab-api-rtn67-minion--us1-rtn67-labs-aprimo-com
lab-proxy-rtn67-minion-apiauto
lab-proxy-rtn67-minion-excelsiorauto
lab-proxy-rtn67-minion-financial1
lab-proxy-rtn67-minion-financial2
lab-proxy-rtn67-minion-us1
marketingopsapi-rtn67-minion--apiautortn67-labs-aprimo-com
marketingopsapi-rtn67-minion--excelsiorautortn67-labs-aprimo-com
marketingopsapi-rtn67-minion--rtn67-financial1-labs-aprimo-com
marketingopsapi-rtn67-minion--rtn67-financial2-labs-aprimo-com
marketingopsapi-rtn67-minion--us1-rtn67-labs-aprimo-com
marketingopsui-rtn67-minion--apiautortn67-labs-aprimo-com
marketingopsui-rtn67-minion--excelsiorautortn67-labs-aprimo-com
marketingopsui-rtn67-minion--rtn67-financial1-labs-aprimo-com
marketingopsui-rtn67-minion--rtn67-financial2-labs-aprimo-com
marketingopsui-rtn67-minion--us1-rtn67-labs-aprimo-com
pushnotification-rtn67-minion--apiautortn67-labs-aprimo-com
pushnotification-rtn67-minion--excelsiorautortn67-labs-aprimo-com
pushnotification-rtn67-minion--rtn67-financial1-labs-aprimo-com
pushnotification-rtn67-minion--rtn67-financial2-labs-aprimo-com
pushnotification-rtn67-minion--us1-rtn67-labs-aprimo-com
reviewfileservice-rtn67-minion--apiautortn67-labs-aprimo-com
reviewfileservice-rtn67-minion--excelsiorautortn67-labs-aprimo-com
reviewfileservice-rtn67-minion--rtn67-financial1-labs-aprimo-com
reviewfileservice-rtn67-minion--rtn67-financial2-labs-aprimo-com
reviewfileservice-rtn67-minion--us1-rtn67-labs-aprimo-com
uls-rtn67-minion--apiautortn67-labs-aprimo-com
uls-rtn67-minion--excelsiorautortn67-labs-aprimo-com
uls-rtn67-minion--rtn67-financial1-labs-aprimo-com
uls-rtn67-minion--rtn67-financial2-labs-aprimo-com
uls-rtn67-minion--us1-rtn67-labs-aprimo-com
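As a rough cross-check of the numbers above: the listing shows 5 master ingresses and 51 minion ingresses for a single namespace, and there are around 150 namespaces. A minimal Python sketch of that arithmetic, assuming every namespace resembles the one listed (which is only approximately true, since the counts vary), lands close to the 8000+ figure in the reproduction steps:

```python
# Rough estimate of the total number of Ingress objects, assuming every
# namespace looks like the one listed above (an assumption; counts vary).
NAMESPACES = 150            # "around 150 namespaces" from the report
MASTERS_PER_NAMESPACE = 5   # ingress-master-* objects in the listing
MINIONS_PER_NAMESPACE = 51  # minion lines in the listing above


def estimate_ingresses(namespaces: int, masters: int, minions: int) -> int:
    """Return the estimated number of Ingress objects across all namespaces."""
    return namespaces * (masters + minions)


total = estimate_ingresses(NAMESPACES, MASTERS_PER_NAMESPACE, MINIONS_PER_NAMESPACE)
print(total)  # 8400
```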

In addition, we 'shut down' some deployments in AKS on a schedule (i.e. we scale them to 0 replicas) to save on costs. When we do this, the endpoint slices associated with these deployments have no IP addresses. NGINX appears to add a default address (127.0.0.1:8181) to the upstream when it discovers an endpoint slice with no IP addresses; we have observed this in the logs:

Address: "127.0.0.1:8181",
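To gauge how many services are in this state, one option is to dump the EndpointSlices with `kubectl get endpointslices -A -o json` and look for services whose slices carry no addresses. The sketch below is a hypothetical helper (`services_without_endpoints` is not part of any tool) that works on that JSON. It assumes the standard `discovery.k8s.io/v1` layout and the `kubernetes.io/service-name` label, and the sample data (namespace and service names) is made up for illustration.

```python
import json
from collections import defaultdict


def services_without_endpoints(slice_list: dict) -> list[tuple[str, str]]:
    """Return sorted (namespace, service) pairs whose EndpointSlices hold no addresses."""
    address_counts: dict[tuple[str, str], int] = defaultdict(int)
    for item in slice_list.get("items") or []:
        meta = item.get("metadata") or {}
        service = (meta.get("labels") or {}).get("kubernetes.io/service-name")
        if service is None:
            continue  # slice is not associated with a Service
        key = (meta.get("namespace", ""), service)
        # "endpoints" may be missing or null when a deployment is scaled to 0.
        endpoints = item.get("endpoints") or []
        # A Service can own several slices, so sum addresses across all of them.
        address_counts[key] += sum(len(ep.get("addresses") or []) for ep in endpoints)
    return sorted(key for key, count in address_counts.items() if count == 0)


# Tiny hand-written sample standing in for real kubectl output (hypothetical names).
sample = json.loads("""
{"items": [
  {"metadata": {"namespace": "rtn67", "labels": {"kubernetes.io/service-name": "uls"}},
   "endpoints": null},
  {"metadata": {"namespace": "rtn67", "labels": {"kubernetes.io/service-name": "budget-ui"}},
   "endpoints": [{"addresses": ["10.0.0.12"]}]}
]}
""")
print(services_without_endpoints(sample))  # [('rtn67', 'uls')]
```

With real data, the same function could be fed `json.load(sys.stdin)` from the piped `kubectl` output. The resulting count gives a sense of how many upstreams would end up with the placeholder address.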


Labels

backlog, enhancement

Status

In Review 👀
