How should I organize message broker queues/topics in a transactional system?

Fintech

ksimmi, 2020-08-25 12:08:17

Good afternoon!
My team and I were recruited to a new project on the strength of four years of successful work building a microservice platform for a bank from scratch. That platform runs Internet banking, electronic money, and a payment system for various services, integrated with several acquirers.
The new client essentially wants the same thing we built for the bank, with one difference: the requirements mandate that inter-service calls go through Apache Kafka. The current project is much larger than the previous one; the transactional system (my team's responsibility) is about 1/6 of the whole. In other words, six teams are working on it, and the transport layer between services is implemented on Kafka. Our only experience with inter-service calls is over HTTP. I have nothing against Kafka, and in general I think it is an even better fit than HTTP, but we have no experience at all with message brokers.

The background is this: at first the requirements called for RabbitMQ, which we did not know either. It was interesting: we read up on it, applied it, and it worked. Two months later an expert joined the team and, with some arguments, convinced the architect to switch the transport layer to Kafka. Fine: we read up on that too and spent a week reworking things; it works, and nothing changed fundamentally. Four more months passed and that expert quit. In total the six teams have about 35 people, and none of them has experience with message brokers.

Everything works on the test servers, but I am afraid of going to production. Roughly speaking, the transactional system implements the SAGA pattern with orchestration: there is a main transaction orchestrator service and services subordinate to it, which fall into two groups:

1 what the payment is for:
1.1 an order from our marketplace;
1.2 an order from partner services;
1.3 other services integrated through service aggregators, such as QIWI;
2 how the payment is made:
2.1 by card;
2.2 with a loan from one of the banks;
2.3 with bonuses;
2.4 partly with bonuses.

Each sub-item is a separate service or a group of services; there are currently 15 of them in total, and a few more are in development. The orchestrator determines "for what" and "how" the payment is made and delegates to the subordinate services. With HTTP this would be two or three ordinary requests and that would be it. But what about Kafka? Right now, by analogy with HTTP, each service has its own topic that it subscribes to, and depending on the type of payment the orchestrator simply publishes an event to the corresponding topic. So far it works, but does this approach have drawbacks? My other idea is to create one single topic, subscribe all services to it, and add a transaction type to each message so that every subscriber can tell whether the message is meant for it.
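To make the two options concrete, here is a minimal in-memory sketch. It does not use a real Kafka client; `InMemoryBroker`, the topic names (`<service>.commands`, `payments.all`) and the message fields are assumptions made up for illustration only. It just counts how many messages a service has to read versus how many it actually handles under each layout.

```python
from collections import defaultdict


class InMemoryBroker:
    """Toy stand-in for a broker: a topic is just a list of messages."""

    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic, message):
        self.topics[topic].append(message)

    def read_all(self, topic):
        return list(self.topics[topic])


def consume_own_topic(broker, service):
    # Option A: the service reads only its own topic, so every message is relevant.
    messages = broker.read_all(f"{service}.commands")
    return {"read": len(messages), "handled": len(messages)}


def consume_shared_topic(broker, service):
    # Option B: the service reads everything and keeps only its own payment type.
    messages = broker.read_all("payments.all")
    mine = [m for m in messages if m["type"] == service]
    return {"read": len(messages), "handled": len(mine)}


payments = ["card", "card", "loan", "bonus", "card"]

broker = InMemoryBroker()
for i, kind in enumerate(payments):
    command = {"tx_id": i, "type": kind}
    broker.publish(f"{kind}.commands", command)  # option A: topic per service
    broker.publish("payments.all", command)      # option B: one shared topic

option_a = consume_own_topic(broker, "loan")
option_b = consume_shared_topic(broker, "loan")
print("own topic:   ", option_a)
print("shared topic:", option_b)
```

With a shared topic every service reads and discards traffic meant for the others, and that overhead grows with the number of services. In real Kafka each service would also need its own consumer group on the shared topic in order to see every message.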

Which of the options is correct? Or are both bad, and if so, how should it be done?

Thanks for the help and advice!

1 answer(s)

ksimmi, 2021-05-31

I decided to answer my own question. Service coordination follows the SAGA pattern with orchestration: the service that implements the saga has an INDIVIDUAL command channel to each saga participant and one SHARED channel for the replies from all of them.
Thanks to everyone who participated in the discussion.
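The channel layout from the answer (an individual command channel per participant, one shared reply channel) might look roughly like the sketch below. It is an in-memory illustration, not the actual implementation: `Channels`, the channel names, the participant names and the message fields are all hypothetical, and the participant is called inline where a real system would run it as a separate consumer process.

```python
from collections import defaultdict, deque

REPLY_CHANNEL = "saga.replies"  # hypothetical name for the shared reply channel


class Channels:
    """In-memory stand-in for broker topics (FIFO queues keyed by name)."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def send(self, channel, message):
        self.queues[channel].append(message)

    def receive(self, channel):
        queue = self.queues[channel]
        return queue.popleft() if queue else None


def participant(name, channels):
    """Take one command from this participant's own channel, reply on the shared one."""
    command = channels.receive(f"{name}.commands")
    if command is None:
        return
    channels.send(REPLY_CHANNEL, {
        "saga_id": command["saga_id"],  # correlation id: lets the orchestrator match replies
        "from": name,
        "status": "ok",
    })


def run_saga(saga_id, steps, channels):
    """Orchestrator: send a command, wait for the matching reply, then move on."""
    completed = []
    for step in steps:
        channels.send(f"{step}.commands", {"saga_id": saga_id, "action": "execute"})
        participant(step, channels)  # stands in for a separate consumer process
        reply = channels.receive(REPLY_CHANNEL)
        if reply is None or reply["saga_id"] != saga_id or reply["status"] != "ok":
            break
        completed.append(reply["from"])
    return completed


channels = Channels()
result = run_saga("tx-1", ["marketplace_order", "card_payment"], channels)
print(result)
```

Because all replies arrive on one channel, each reply has to carry a correlation id (here `saga_id`) and the sender's name so the orchestrator can tell which saga and which step it belongs to.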
Zhainar
Thanks for the reference to Chris Richardson's book; I read all of it. It was very useful: I understand the SAGA pattern much better now, and the book gave me the answer I was looking for.
SirotaKazansky
What I called "frequent consistency violations" is what Chris Richardson calls "eventual consistency". This is a state where each service is consistent on its own, but the system as a whole may still be in the process of reaching consistency. The second-line support specialist I mentioned earlier handles the cases where the system, for whatever reason, does not converge: after analyzing the situation, they simply restart the specific step of the saga or cancel the saga by launching compensating transactions.
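The two operator actions described above (restart a stuck step, or cancel by running compensating transactions) can be sketched as follows. This is an illustrative model under assumed names (`Step`, `Saga`, `retry`, `cancel`, and the step names), not the code of the actual system; each step's action returns `True` on success and `False` on failure.

```python
class Step:
    def __init__(self, name, action, compensation):
        self.name = name
        self.action = action              # returns True on success
        self.compensation = compensation  # undoes a completed action


class Saga:
    def __init__(self, steps):
        self.steps = steps
        self.next_index = 0  # index of the first step that has not completed
        self.log = []

    def run(self):
        """Run the remaining steps; stop at the first failure and stay stuck there."""
        while self.next_index < len(self.steps):
            step = self.steps[self.next_index]
            if not step.action():
                self.log.append(f"{step.name}: failed")
                return False
            self.log.append(f"{step.name}: done")
            self.next_index += 1
        return True

    def retry(self):
        """Operator action: restart from the stuck step."""
        return self.run()

    def cancel(self):
        """Operator action: compensate completed steps in reverse order."""
        for step in reversed(self.steps[:self.next_index]):
            step.compensation()
            self.log.append(f"{step.name}: compensated")
        self.next_index = 0


attempts = {"charge": 0}

def charge_card():
    attempts["charge"] += 1
    return attempts["charge"] > 1  # fails on the first attempt, succeeds on retry

# Case 1: the step fails once and the operator restarts it.
saga = Saga([
    Step("reserve_order", lambda: True, lambda: None),
    Step("charge_card", charge_card, lambda: None),
])
first = saga.run()
second = saga.retry()

# Case 2: the step keeps failing and the operator cancels the saga.
stuck = Saga([
    Step("reserve_order", lambda: True, lambda: None),
    Step("charge_card", lambda: False, lambda: None),
])
stuck.run()
stuck.cancel()

print(saga.log)
print(stuck.log)
```

Retrying a step is only safe if the participant handles the repeated command idempotently; otherwise a restart could, for example, charge a card twice.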