Slow Import of vHosts/RabbitMQ Definitions in RabbitMQ Cluster (Kubernetes & VMs) #5672
-
Hi, I am seeing slow import of vHosts/RabbitMQ definitions in a 3-node RabbitMQ cluster, both on Kubernetes and on VMs.

In our test environment, I created a 3-node RabbitMQ cluster on Kubernetes running RabbitMQ 3.10.5. Importing the definitions (~1000 vHosts) takes more than half an hour. The initial vHosts load quickly, e.g. the first 100 are imported very fast, but as the import proceeds from 100 to 1000 it gets progressively slower. I tried importing definitions via the RabbitMQ Management UI, via the HTTP API in a single-threaded for loop from a bash script (sketched below), and by going inside the pod and loading definitions with rabbitmqctl. The Management UI also slows down while the import is running (especially the vHosts page). On a single-node RabbitMQ server (no clustering), the same import takes ~10 minutes.

Is there any RabbitMQ configuration parameter that affects the bulk vHost definition import process that we can tune?

In our production environment we have 1000+ vHosts on AWS EC2 VMs (a 3-node RabbitMQ cluster) and we see the same issue there: large definition imports take a long time and slow down progressively. We use vHosts to separate tenant communications. Can anyone help?
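For reference, here is a minimal sketch of the single-threaded loop I mentioned. The host, port, credentials, and `tenant-N` vhost naming are placeholders, not our actual setup:

```bash
#!/usr/bin/env bash
# Minimal sketch of the single-threaded import loop (placeholders throughout:
# localhost:15672, guest credentials, and tenant-N names are hypothetical).
for i in $(seq 1 1000); do
  # PUT /api/vhosts/{name} creates a vhost via the management HTTP API
  curl -fsS -u guest:guest -X PUT "http://localhost:15672/api/vhosts/tenant-${i}"
done

# Alternative tried inside the pod: import a full definitions file at once
rabbitmqctl import_definitions /tmp/definitions.json
```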
-
Starting with 3.7.0 or so, every virtual host is a separate tree of processes plus two message stores. This means a failure of a single message store does not affect other virtual hosts, but it also means that starting hundreds or thousands of virtual hosts takes much longer. In a cluster, a virtual host is only considered ready after all nodes report that they have started the entire tree of dependent processes.

I do not expect this virtual host design to change significantly even in 4.0. So the right thing to do is to use fewer virtual hosts and/or more focused clusters (built for a specific purpose or set of applications) instead of sharing a single cluster for everything. One way to carve a per-tenant definitions file out of a full export is sketched below.
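A hedged sketch of that split using jq; the file names and vhost name are hypothetical, and it assumes a complete definitions export where all of these top-level keys are present:

```bash
# Hypothetical helper: extract one tenant's objects from a full definitions
# export so it can be imported into a smaller, dedicated cluster.
# Assumes a complete export containing all of these top-level keys.
VHOST="tenant-42"   # placeholder vhost name
jq --arg v "$VHOST" '
  .vhosts      |= map(select(.name  == $v)) |
  .permissions |= map(select(.vhost == $v)) |
  .queues      |= map(select(.vhost == $v)) |
  .exchanges   |= map(select(.vhost == $v)) |
  .bindings    |= map(select(.vhost == $v)) |
  .policies    |= map(select(.vhost == $v))
' all-definitions.json > "definitions-${VHOST}.json"
```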
-
I've looked into this a few months back when it was reported on the mailing list. I believe the problem is not really about the processes being started (that would affect Khepri-enabled deployments as well, yet importing 1000 vhosts is much faster with Khepri). I believe the culprit is Mnesia locking around exchange declarations: each vhost has 7 default exchanges, so 1000 vhosts means 7000 exchanges to declare. Could that code path be optimised? Almost certainly yes (PRs welcome), but we are focused on migrating from Mnesia to Khepri to avoid this whole class of issues going forward; a sketch of opting into Khepri on versions that support it follows below.

Having said that, as Michael said, having hundreds of tenants or more in a single RabbitMQ cluster is probably not a good idea for other reasons.
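For completeness, a sketch only: Khepri is an opt-in metadata store starting with RabbitMQ 3.13 (behind the `khepri_db` feature flag), so this does not apply to the 3.10.x cluster discussed here:

```bash
# Check which feature flags are available and their state
rabbitmqctl list_feature_flags
# On 3.13+, migrate the metadata store from Mnesia to Khepri
rabbitmqctl enable_feature_flag khepri_db
```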