MongoDB
WebDev, 2017-01-14 21:44:43

A few questions about MongoDB

Hello. I have never worked with MongoDB and have just started looking into it, and one thing is not entirely clear. Various articles describe the structure of documents in Mongo, which in SQL terms is simply denormalized data, with all the ensuing drawbacks. The only difference from SQL is the lack of a schema.
Hence the speed: if you store all the data in a single table in MySQL, it will also be very fast.
At the same time, they enthusiastically talk about the "pluses" of such denormalization and present it as if it were a discovery. So all our lives we were taught to normalize data and told why that is good, and now everything is exactly the opposite?
Please explain in simple terms: do I understand everything correctly? How do you update data in Mongo that is duplicated in every document? Is it acceptable in Mongo to normalize data, that is, to store it in a separate collection and fetch it with a separate query (because there are no joins)? Is that how it is actually done? And what do you do when the data structure changes, given that there is no schema?
Thank you.
UPD:
For clarity, please explain how you would implement in Mongo a structure that looks like this in SQL:
users - site users.
authors - authors of articles on the site.
news - news published by authors (foreign key for authors).
author_user - link between users and authors; users subscribe to their favorite authors to read their news (foreign keys to authors and users).
The task of the project is to show users the news of the authors they are subscribed to and to let them manage those subscriptions.
I have 2 options in mind:
1) Create 4 collections with the corresponding names and store the ids of the related entities. But that turns out to be the same SQL, and you have to select from different collections. I think this is the wrong move.
2) Store denormalized data about the authors in the collection with their articles. That part is clear, but it is not clear how to add subscribers. After all, today 10 people have subscribed to an author, and tomorrow 100.
In addition, suppose that tomorrow the posts need a new "genre" field, and the format of the "news rating" field has to change in all existing ones. Does that mean new documents will have a genre field and old ones will not? Will this have to be tracked at the application level?

2 answer(s)
Philipp, 2017-01-15
@kirill-93

I have been working with MongoDB for more than 3 years, so my advice is based on personal production experience.
It is not really the opposite of what you were taught. You were taught to work with only one kind of database: relational. Now you have seen that there are others, document-oriented ones. Naturally, each kind has its own approach to storing and organizing data.
It's not good or bad, it's different.
Undoubtedly, there is hype around the term NoSQL. There are reasons for it, mainly that there really is more data. Information entropy is increasing, and it is becoming harder and harder to fit data into the framework of relational databases. One could argue about this for a long time, but I can say with confidence that there is now demand for storage whose structure can be changed faster than relational databases allow.
As for whether you understand it correctly: for the most part, yes. This is denormalized data, but with some nuances. I will show them using your own example.
Data that is duplicated across documents is updated with ordinary updates.
For example, say I have a collection of books in which I need to update the authors.
A typical document looks like this:

{
    "_id" : ObjectId("5801aa17964c6b2a050041a7"),
    "title" : "New Book",
    "authors" : [ 
        {
            "_id" : ObjectId("5801aa0f964c6b26030041a9"),
            "firstName" : "Phil",
            "lastName" : "Tkachev"
        }
    ]
}

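If I want to change my first name in every book where I am an author, the query looks like this:
// Rename the author in every book that lists him.
db.getCollection('book').update(
  // Match books whose authors array contains this author _id.
  { 'authors._id': ObjectId("5801aa0f964c6b26030041a9") },
  // The positional $ refers to the first array element matched by the query.
  { $set: { 'authors.$.firstName': 'Philipp' } },
  // Without multi: true only the first matching document is updated.
  { multi: true }
)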

There are many other update scenarios; just read the documentation, it covers them well.
Keeping data in a separate collection and fetching it with a separate query is also done in a number of cases. Various ORM-style libraries, Mongoose for example, do just that.
Whether that is the right way to work is a yes-and-no question. With this kind of database you organize the data around the problem being solved, starting from the design of the future application.
You just need to answer one question: which is cheaper, requesting a document by key on every read, or updating a record embedded inside many documents?
Take your site, with its news and authors, as an example. News can be read by millions, so if authors are stored separately, every view of a news item needs an extra query for the author's data, i.e. two requests instead of one. And if you show a list of 100 news items? Will you make 100 secondary requests? No, that is wrong too. You fetch the list of news, collect the author IDs in application code, make one second query for those authors, and then merge the result into the list you already have. This complicates the application a little, but it saves resources. If you embed the authors inside the article instead, one query to the database is enough, both for viewing a single item and for the list. On the other hand, you will have to take care of updating the author information. But such information changes relatively rarely, and that is the point of embedding.
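The "collect the IDs, make one second query, merge" approach can be sketched roughly as follows. This is only an illustration in plain JavaScript: the arrays stand in for the two collections, and the names `findAuthorsByIds`, `attachAuthors` and the `authorId` field are assumptions, not part of any driver. Against a real database, `findAuthorsByIds` would be a single query by a list of ids.

```javascript
// Hypothetical sample data; plain arrays stand in for the collections.
const news = [
  { _id: 1, title: "First", authorId: "a1" },
  { _id: 2, title: "Second", authorId: "a2" },
  { _id: 3, title: "Third", authorId: "a1" },
];
const authors = [
  { _id: "a1", name: "Phil" },
  { _id: "a2", name: "Kirill" },
];

// Stand-in for ONE secondary request that returns all requested authors.
function findAuthorsByIds(ids) {
  const wanted = new Set(ids);
  return authors.filter((a) => wanted.has(a._id));
}

function attachAuthors(newsList) {
  // Step 1: collect the distinct author IDs from the news list.
  const ids = [...new Set(newsList.map((n) => n.authorId))];
  // Step 2: fetch all those authors at once and index them by _id.
  const byId = new Map(findAuthorsByIds(ids).map((a) => [a._id, a]));
  // Step 3: merge the author data into each news item.
  return newsList.map((n) => ({ ...n, author: byId.get(n.authorId) || null }));
}

const result = attachAuthors(news);
```

However long the list is, this costs two requests in total rather than one per news item.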
Handling a changed structure is simple: you build change handling into the application from the start. For example, you can add a version field to each document that stores the version number of its structure, and react to it in code. Or you can write the application so that it automatically converts a document from the old structure to the new one the first time it is accessed.
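A minimal sketch of the version-field idea, using the "genre" and "rating" change from the question. Everything here is an assumption made for illustration: the field name `schemaVersion`, the version numbers, and the old and new formats of `rating` (a string before, a number after).

```javascript
const CURRENT_VERSION = 2; // hypothetical version of the current structure

// Assumed v1 shape: { rating: "4.5" } (string), no genre field.
// Assumed v2 shape: { rating: 4.5 } (number), genre field present.
function upgradeNews(doc) {
  // Documents written before versioning have no field and count as v1.
  const version = doc.schemaVersion || 1;
  if (version >= CURRENT_VERSION) return doc; // already up to date
  return {
    ...doc,
    genre: doc.genre === undefined ? null : doc.genre, // new field, empty by default
    rating: Number(doc.rating), // convert the old string format
    schemaVersion: CURRENT_VERSION,
  };
}

const oldDoc = { _id: 1, title: "Old news", rating: "4.5" };
const upgraded = upgradeNews(oldDoc);
```

The application would call such a function on every document it reads and could save the upgraded version back, so old documents are converted gradually on first access.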
Judging by your description, you have a news site.
It would be logical to structure it as follows.
News collection:
{
  _id: 'MongoId',
  title: '',
  body: '',
  author: {
    _id: 'user ID',
    name: 'user name',
    subscribers: 'number of subscribers'
  }
}

The author is an incomplete copy of the user data. This will help save space and avoid unnecessary queries.
User Collection:
{
  _id: 'MongoId',
  name: 'user name',
  email: '',
  roles: ['user', 'author', 'admin'],
  subscribers: 'Number'
}

The list of roles defines what a user is allowed to do. It is easy to change, and it makes authors or admins easy to find.
Subscriptions collection:
{
  initiator: {
    _id: 'ID of the user who initiated the subscription',
    name: 'user name'
  },
  target: {
    _id: 'author ID',
    name: 'user name'
  },
  date: 'ISODate',
  confirmed: 'bool'
}

Here you can get either the list of subscriptions or the list of subscribers with a single request.
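As a rough illustration of how these documents would be used, the sketch below filters plain arrays that stand in for the collections. The sample data and the function names are made up. Against a real database each filter would be a single query on `initiator._id` or `target._id`, and the feed would be one more query on the `author._id` of the news documents.

```javascript
// Hypothetical sample data shaped like the documents above.
const subs = [
  { initiator: { _id: "u1", name: "Anna" }, target: { _id: "a1", name: "Phil" } },
  { initiator: { _id: "u1", name: "Anna" }, target: { _id: "a2", name: "Olga" } },
  { initiator: { _id: "u2", name: "Boris" }, target: { _id: "a1", name: "Phil" } },
];
const feedNews = [
  { _id: 1, title: "N1", author: { _id: "a1", name: "Phil" } },
  { _id: 2, title: "N2", author: { _id: "a2", name: "Olga" } },
  { _id: 3, title: "N3", author: { _id: "a1", name: "Phil" } },
];

// Authors a user is subscribed to: one filter on initiator._id.
function subscriptionsOf(userId) {
  return subs.filter((s) => s.initiator._id === userId).map((s) => s.target.name);
}

// Subscribers of an author: one filter on target._id.
function subscribersOf(authorId) {
  return subs.filter((s) => s.target._id === authorId).map((s) => s.initiator.name);
}

// News feed: the news of every author the user is subscribed to.
function feedFor(userId) {
  const authorIds = new Set(
    subs.filter((s) => s.initiator._id === userId).map((s) => s.target._id)
  );
  return feedNews.filter((n) => authorIds.has(n.author._id));
}
```

Because the names are copied into each subscription document, both lists can be shown without a second lookup in the users collection.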

lega, 2017-01-14
@lega

Conveniently, MongoDB lets you choose the level of (de)normalization you want, which speeds up both development and the application itself.
There are no joins, but there are lookups, which are awkward to use.
Updating data in Mongo works fine, and there are many different options for it.
It is better to keep the structure in the application (although you can set up some validation in Mongo itself).
As for me, what most projects lack is proper transactions; the rest is more or less fine.
