How to properly organize MongoDB architecture?

qovalenko, 2019-05-10 00:16:10

MongoDB

I need to design a database; please tell me whether this is the right approach.
Following my own logic, I arrived at this solution:
MongoDB. The first collection holds documents of the same type, each containing gender, weight, height, age, and a reference to the clothes the person is wearing. The second collection holds the documents referenced by the first: the type of clothing (t-shirt, jeans), its color, and who refers to it.
How do I now write a query such as: select all height values of the people who refer to blue pants?
I understand that SQL fits this problem well. However, I greatly simplified the example for the question. In reality, the documents in the first collection contain a large number of parameters, present in some documents and absent in others, which is why I chose Mongo. The data is now neatly grouped by parameters into another collection, and this turned out to be very convenient, except for one BUT!


3 answer(s)

nrgian, 2019-05-10
@qovalenko

Never do it this way.
You are doing what you would do with a relational DBMS such as MySQL.
There, relationships between tables are the norm.
Mongo handles them very poorly.
It relies on denormalization instead.
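To make this concrete: with the two-collection layout from the question, selecting heights by clothing needs a join, which in Mongo means a `$lookup` aggregation stage (or two separate queries). A rough sketch follows. The collection names `people` and `clothes` and the field `clothes_ids` are my assumptions, not from the question, and the small function only imitates in plain Python what the pipeline would do, so it runs without a server:

```python
# Hypothetical sample data for the two-collection layout from the question.
people = [
    {"_id": 1, "gender": "m", "height": 180, "clothes_ids": [10, 11]},
    {"_id": 2, "gender": "f", "height": 165, "clothes_ids": [12]},
    {"_id": 3, "gender": "m", "height": 172, "clothes_ids": [10]},
]
clothes = [
    {"_id": 10, "type": "pants", "color": "blue"},
    {"_id": 11, "type": "t-shirt", "color": "white"},
    {"_id": 12, "type": "pants", "color": "black"},
]

# Aggregation pipeline that would be run on the people collection,
# e.g. with pymongo: db.people.aggregate(lookup_pipeline)
lookup_pipeline = [
    {"$lookup": {"from": "clothes", "localField": "clothes_ids",
                 "foreignField": "_id", "as": "clothes"}},
    {"$match": {"clothes": {"$elemMatch": {"type": "pants", "color": "blue"}}}},
    {"$project": {"_id": 0, "height": 1}},
]

def heights_via_join(people, clothes, type_, color):
    """Plain-Python imitation of the join above: resolve references, then filter."""
    by_id = {c["_id"]: c for c in clothes}
    return [
        p["height"] for p in people
        if any(by_id[i]["type"] == type_ and by_id[i]["color"] == color
               for i in p["clothes_ids"] if i in by_id)
    ]

print(heights_via_join(people, clothes, "pants", "blue"))
```

Every such selection has to resolve the references first, and that is the part Mongo is not good at.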
Here I agree with grinat.
If you already work with methods intended for relational DBMS, then:
UPDATED:
qovalenko,

I understand you. The point is that if I add the data from the second collection to the documents of the first, then whenever this data is updated I will need to change it in several documents of the first collection, and that is not very convenient.

That is fine.
It is a consequence of denormalization.
It is the price you pay for the benefits of Mongo.
If you want normal forms without duplicates, then relational DBMSs are the direct path for you: PostgreSQL / MySQL / MS-SQL / Oracle, etc.
After all, NoSQL is not that fast and that scalable for nothing.
Do you really think that relational DBMS developers, who have been building them for more than 40 years, could not achieve the impressive results that NoSQL achieved in a mere 10 years?
In Mongo and other NoSQL systems, a lot has been cut out compared to strict relational DBMSs. That alone is what lets them work fast and scale easily.
But everything has a price.
Take, for example, the fact that during replication the data on the Mongo servers becomes correct "sometime later, but we don't know exactly when": Eventually Consistent.
Denormalization is another such price.
I'll tell you more: if you don't want the performance of your system to sag, you will have to bring these duplicates into agreement not immediately upon change but with a separate synchronization procedure, launched, for example, once an hour. During that hour, one part of your Mongo will have some data, and another part will have different data.
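Such a separate synchronization procedure could look roughly like this. It is only a sketch on plain Python dicts: the field names (`clothes_id`, `version`) and the function `sync_duplicates` are invented for illustration, and a real job would do the same thing with bulk updates against Mongo:

```python
# Source of truth for clothing items (hypothetical structure).
clothes = {
    10: {"type": "pants", "color": "blue", "version": 2},
    11: {"type": "t-shirt", "color": "white", "version": 1},
}

# Denormalized documents: each person carries a copy of the clothing data.
people = [
    {"_id": 1, "height": 180,
     "clothes": [{"clothes_id": 10, "type": "pants", "color": "black", "version": 1}]},
    {"_id": 2, "height": 165,
     "clothes": [{"clothes_id": 11, "type": "t-shirt", "color": "white", "version": 1}]},
]

def sync_duplicates(people, clothes):
    """Refresh stale embedded copies; meant to run periodically (e.g. from cron)."""
    updated = 0
    for person in people:
        for copy in person["clothes"]:
            source = clothes.get(copy["clothes_id"])
            if source is not None and source["version"] > copy["version"]:
                copy.update(source)  # overwrite the stale fields with the current ones
                updated += 1
    return updated

print(sync_duplicates(people, clothes))  # number of copies refreshed
print(people[0]["clothes"][0]["color"])
```

Until the job runs, person 1 still shows the old color, which is exactly the "different data in different places" window described above.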
What you want to do, with normalization, cannot be done in Mongo for reasons of performance and transaction correctness.
It is simply not meant for that. This is exactly what Mongo cut out (more precisely, never implemented in the first place) compared to relational DBMSs.
Only in a relational DBMS can you do everything exactly the way you want (but there you pay with scaling limits).
If the project is not very large (say, the data is a few terabytes or less, so a single server can hold all of it, and at most 2-3 servers are needed for replication), then a relational DBMS will perform very well and there is no point in Mongo.
This video explains clearly who has which advantages and disadvantages:
Postgres vs Mongo / Oleg Bartunov
If you like Mongo because it is schemaless, PostgreSQL already has that too:
Smart jsonb indexing | Oleg Bartunov, Nicky...
With it, you no longer have to declare every field separately in CREATE TABLE (though it is still advisable to declare separately the fields through which tables are linked, that is, all kinds of IDs, so that the query optimizer works better).
Note that PostgreSQL uses the JSONB data type for this, not to be confused with plain JSON.
If you want to stay with Mongo, you need to arrange things so that 1 user request in your online store (or whatever you have) ultimately boils down to 1 query that retrieves data from one single Mongo collection.
That means denormalization, which means data duplication, which in turn means the duplicates have to be synchronized.
And if the data changes intensively, the duplicates will have to be synchronized in a deferred way (by cron, etc.) rather than immediately at write time.
This is normal in Mongo. The Mongo developers themselves recommend doing this.
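For the example from the question, a single-collection layout could look something like this. The collection name `people` and the field names are my assumptions; `matches` is only a tiny stand-in for what Mongo does with the filter, so the snippet runs without a server:

```python
# One denormalized collection: clothing is embedded in each person document.
people = [
    {"_id": 1, "height": 180, "clothes": [{"type": "pants", "color": "blue"},
                                          {"type": "t-shirt", "color": "white"}]},
    {"_id": 2, "height": 165, "clothes": [{"type": "pants", "color": "black"},
                                          {"type": "t-shirt", "color": "blue"}]},
    {"_id": 3, "height": 172, "clothes": [{"type": "pants", "color": "blue"}]},
]

# Filter and projection for a single query against one collection,
# e.g. with pymongo: db.people.find(query_filter, projection)
query_filter = {"clothes": {"$elemMatch": {"type": "pants", "color": "blue"}}}
projection = {"_id": 0, "height": 1}

def matches(doc, flt):
    """Stand-in for $elemMatch with equality conditions only."""
    wanted = flt["clothes"]["$elemMatch"]
    return any(all(item.get(k) == v for k, v in wanted.items())
               for item in doc["clothes"])

heights = [doc["height"] for doc in people if matches(doc, query_filter)]
print(heights)
```

`$elemMatch` matters here: it requires one and the same array element to be both pants and blue, so person 2 (black pants, blue t-shirt) is not selected.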

longclaps, 2019-05-10
@longclaps

The first collection holds documents of the same type, each containing gender, weight, height, age, and a reference to the clothes the person is wearing.

A typical SQL table.

The second collection holds the documents referenced by the first: the type of clothing (t-shirt, jeans), its color, and who refers to it.

A typical SQL table.

grinat, 2019-05-10
@grinat

The correct thing would be to remove Mongo and install MySQL/Postgres.