Transactional messaging: what implementations of transactional message queues exist?
Hello.
Over the last almost 5 years I have taken part, in various roles, in three fintech projects with a microservice architecture. While working on the last of them, I read the book "Microservices. Development and Refactoring Patterns" by Chris Richardson and came away wanting to implement a transactional queue.
Questions:
1 I have already seen both implementations proposed by Chris Richardson that use an `OUTBOX` table (`Transaction log tailing` and `Polling publisher`). What alternative implementations are possible?
2 Won't all of this be too slow?
3 What implementations or tips are there for this stack: Python, Postgres, NATS?
4 I once saw Postgres plugins with direct integration with either Kafka or RabbitMQ; maybe I can find something similar for NATS. Is it a good idea to publish events to the queue directly from the database? One drawback I see is that such a solution will not exist for every database. My project uses not only Postgres but also Elasticsearch, although so far the services that use it do not really need this kind of transactionality.
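For reference, the `Polling publisher` variant from question 1 can be sketched roughly as follows. This is a minimal illustration, not a production implementation: `sqlite3` stands in for Postgres so the snippet runs on its own, and the `orders` table, the `orders.created` subject, and the `publish` callback (which would wrap a NATS client in the real stack) are all hypothetical names.

```python
import json
import sqlite3


def init_db(conn):
    # Hypothetical schema: one business table and one outbox table.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " subject TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    conn.commit()


def create_order(conn, order_id):
    # The business change and the event are written in ONE local transaction:
    # either both rows are committed or neither is.
    with conn:
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'created')", (order_id,))
        conn.execute(
            "INSERT INTO outbox (subject, payload) VALUES (?, ?)",
            ("orders.created", json.dumps({"order_id": order_id})),
        )


def poll_and_publish(conn, publish, batch_size=100):
    # Read pending events in insertion order and hand them to the broker.
    rows = conn.execute(
        "SELECT id, subject, payload FROM outbox ORDER BY id LIMIT ?", (batch_size,)
    ).fetchall()
    sent = 0
    for row_id, subject, payload in rows:
        publish(subject, payload)  # if this raises, the row stays and is retried later
        with conn:
            conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
        sent += 1
    return sent
```

A row is deleted only after `publish` returns, so a crash between publishing and deleting leads to a repeated publish on the next poll. In other words, this gives at-least-once delivery, and consumers have to tolerate duplicates.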
UPD
5 While the `Polling publisher` pattern is simple and clear, the `Transaction log tailing` pattern raised questions for me and still does. Once I understood that for Postgres this means reading the WAL in the background and processing what is read, I started looking for ready-made solutions. At first I seemed to find nothing but theory, but I kept coming across mentions of the ETL pattern "Change Data Capture" (CDC). I found some open source tools (Debezium, WSO2) that implement this pattern: on one side they collect changes from the WAL, and on the other they pass them on to NATS, Kafka, and so on. This looks like what I need. Now the question is: how do I delete processed rows from the `OUTBOX` table after the CDC tool has delivered them to the broker?
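One approach that is often described for log-based CDC is to delete the `OUTBOX` row in the same transaction that inserts it. The insert still lands in the log, so the CDC tool can pick it up, while the table itself stays empty and needs no cleanup. The toy model below only illustrates that idea: `ToyDatabase`, `emit_event`, and `tail_log` are hypothetical names, and the Python list is a stand-in for the WAL, not how Postgres or Debezium actually work internally.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDatabase:
    """Hypothetical stand-in for an outbox table plus its write-ahead log."""
    outbox: dict = field(default_factory=dict)
    wal: list = field(default_factory=list)
    _next_id: int = 1

    def emit_event(self, payload):
        # Insert and delete happen in the same (simulated) transaction:
        # the table ends up empty, yet both changes are recorded in the log.
        row_id = self._next_id
        self._next_id += 1
        self.outbox[row_id] = payload
        self.wal.append(("insert", row_id, payload))
        del self.outbox[row_id]
        self.wal.append(("delete", row_id, None))
        return row_id


def tail_log(db, position, publish):
    """Forward insert records found after `position`; return the new position."""
    for op, row_id, payload in db.wal[position:]:
        if op == "insert":  # delete records are ignored by the relay
            publish(row_id, payload)
    return len(db.wal)
```

The relay has to remember the returned position between runs; otherwise it would publish the same records again after a restart.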
Below is a description of why I want to try ...
One way or another, all of the projects used the Saga and Event Sourcing patterns, which, while an operation is in progress, put the whole system into a state of eventual consistency. In this state each service is consistent on its own, but at the scale of the whole system the operation has already completed in some services and not yet in others.
By and large, this is normal and not at all scary, because "in an ideal world" the operation will eventually reach a consistent state if there are no errors. If errors do occur, compensating transactions are launched, and they also return the system to a consistent state.
However, in all three projects it was not uncommon for the system to end up in a situation where it could not leave this intermediate state on its own. In other words, for an operation to finally complete or roll back, it sometimes had to be "kicked". The kick came either from the automated `health-check-poller` mechanism, which looked for such hung operations, or from a second-line support engineer, after which the process continued and brought the transaction to a consistent state.
Almost always, the cause was that one service processed its part of the saga, made changes to its database, and then could not notify the rest of the system, i.e. it did not publish the event to the queue. Chris Richardson argues that both the Saga and Event Sourcing patterns should be implemented on top of transactional queues. This guarantees that a service takes a message from the broker's queue and then either does everything (executes the business logic, informs the other participants, and marks the message as processed by deleting it from the broker's queue) or does nothing and returns the message to the broker's queue.
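The "all or nothing" handling described above might look roughly like this on the consumer side. It is a sketch under assumptions: `sqlite3` again stands in for Postgres, the `inbox`, `balance`, and `outbox` tables are hypothetical, and the returned `"ack"` / `"nack"` strings represent whatever acknowledgement calls the real broker client provides.

```python
import sqlite3


def init_service_db(conn):
    # Hypothetical schema for one saga participant.
    conn.execute("CREATE TABLE inbox (message_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE balance (account TEXT PRIMARY KEY, amount INTEGER NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    conn.commit()


def handle(conn, message_id, account, delta):
    """Return "ack" if the message was applied (now or earlier), else "nack"."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            seen = conn.execute(
                "SELECT 1 FROM inbox WHERE message_id = ?", (message_id,)
            ).fetchone()
            if seen:
                return "ack"  # redelivered message: already processed, nothing to do
            conn.execute("INSERT INTO inbox (message_id) VALUES (?)", (message_id,))
            cur = conn.execute(
                "UPDATE balance SET amount = amount + ? WHERE account = ?",
                (delta, account),
            )
            if cur.rowcount != 1:
                raise LookupError(f"unknown account {account!r}")
            # The outgoing event is stored in the same transaction as the change.
            conn.execute(
                "INSERT INTO outbox (payload) VALUES (?)",
                (f"balance_changed:{account}:{delta}",),
            )
    except (sqlite3.Error, LookupError):
        return "nack"  # nothing was committed; the broker should redeliver
    return "ack"
```

The business change, the deduplication record, and the outgoing event share one local transaction, so a failure at any step leaves the database untouched and the message goes back to the queue. The `inbox` check makes a redelivery after a lost acknowledgement harmless.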
I am now convinced that having a transactional queue should greatly increase system reliability and reduce troubleshooting and support costs.
Thank you!