Dmitry Labutin, 2015-07-08 14:54:44
PHP

How to properly organize fault tolerance when working with a RabbitMQ cluster?

I'll start with an analogy.
Take MongoDB. A replica set has three nodes. All three are listed in the connection string, and the driver itself determines which one is currently PRIMARY and works with that node.
But what about RabbitMQ? I have set up a RabbitMQ cluster.
How do I connect to the cluster correctly? That is, if one node goes down, how do I continue working with another? The RabbitMQ connection string seems to take only one address (so we always connect to a single node).
One option is to put HAProxy in front of the RabbitMQ nodes so that it monitors them and routes requests to a live one. Is this the only option, or is there a better one?


1 answer(s)
Sergey, 2016-06-07
@yarkin

At the data level: when a replica set is brought up in MongoDB, it knows that the data must be duplicated across several nodes. When a RabbitMQ cluster is created (in fact, it is a cluster of Erlang nodes), only the metadata (information about exchanges, queues, and so on) is replicated to all machines, while the messages themselves continue to live on one node (although they are transparently forwarded/collected if the client is connected to a different cluster node). The RabbitMQ documentation has an article on resiliency to data loss.
At the connection level: the MongoDB driver opens connections to each replica set node (perhaps only up to a certain number of them, I'm not sure exactly) and, depending on the operation and its parameters, uses one connection or another. In other words, replica set support lives in the driver. RabbitMQ uses AMQP as its native protocol, and an AMQP connection works with only one broker node. Most likely there are plenty of libraries that wrap connections to several RabbitMQ servers: either a single connection, with another brought up when the first fails, or several connections at once, which gives load balancing and only a small lag when one node fails. And if your library cannot work with several nodes, I don't think it would be much trouble to write that yourself.
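
The "one connection, and another is brought up when the first fails" approach could look roughly like the sketch below. It is written in Python with only the standard library and is not tied to any particular AMQP client: the function name `connect_with_failover`, the `connect` callable, and the `retry_on` parameter are all made up for illustration, and you would pass in whatever function your client library actually uses to open a connection.

```python
import random
from typing import Callable, Sequence, Tuple, Type, TypeVar

Conn = TypeVar("Conn")
Node = Tuple[str, int]  # (host, port)


def connect_with_failover(
    nodes: Sequence[Node],
    connect: Callable[[str, int], Conn],
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    shuffle: bool = True,
) -> Tuple[Node, Conn]:
    """Try cluster nodes one by one; return the first node that accepts a connection.

    `connect` is a hypothetical hook: any callable that opens a connection to
    (host, port) and raises one of `retry_on` when the node is unreachable.
    """
    candidates = list(nodes)
    if shuffle:
        # Randomize the order so that clients do not all pile onto the first node.
        random.shuffle(candidates)

    errors = []
    for host, port in candidates:
        try:
            return (host, port), connect(host, port)
        except retry_on as exc:
            # This node is down or unreachable: remember why and try the next one.
            errors.append(f"{host}:{port}: {exc}")

    raise ConnectionError("no cluster node reachable: " + "; ".join(errors))
```

To use it, pass the list of cluster nodes, for example `[("rabbit1", 5672), ("rabbit2", 5672), ("rabbit3", 5672)]` (the hostnames are placeholders, 5672 is the default AMQP port), and call the function again whenever the current connection drops. This only covers reconnecting: as described above, messages in a queue that lived on the failed node do not move to the surviving nodes by themselves.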
