Should I choose MongoDB for a new project?
Hello. We are developing a service for collecting and then analyzing statistical data. We are choosing a database and have some doubts.
I am considering Mongo:
There is a collection of projects. At a certain interval, data is collected for each project and written into another collection, one document per day. Each such document has a nested set of various metrics, and it is these metrics that the end user works with.
The logic of working with Mongo comes out like this: to select data for further processing and output to the user (charts, analysis), you need to do the following (a rough query sketch follows the list):
1. Select the project by ID. In theory this should be very fast; it is needed to check access rights to the project and some settings, and the project document also stores the categories whose data will be selected in step 2.
2. Select the project's data records for a given period, grouped by category (that is, go through all the categories and select the data for the period for each one). This also seems to perform fine.
3. Further data processing takes place inside the script and is quite fast.
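A minimal sketch of how steps 1 and 2 might look with pymongo. The database, collection and field names (stats_service, projects, daily_stats, categories, project_id, category, day) are assumptions made up for illustration, not the actual schema:

import datetime
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["stats_service"]

def load_project_data(project_id, date_from, date_to):
    # Step 1: fetch the project document (access rights, settings, categories).
    project = db.projects.find_one({"_id": project_id})
    if project is None:
        raise LookupError("project not found")

    # Step 2: one query per category for the requested period,
    # exactly as described in the list above.
    result = {}
    for category in project.get("categories", []):
        cursor = db.daily_stats.find({
            "project_id": project_id,
            "category": category,
            "day": {"$gte": date_from, "$lte": date_to},
        })
        result[category] = list(cursor)

    # Step 3: further processing is done in application code.
    return project, result

project, data = load_project_data(
    42, datetime.datetime(2013, 9, 1), datetime.datetime(2013, 9, 30))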
On an almost empty database this whole cycle takes 22-35 ms. If I do the same on MySQL, the logic stays the same, but I have to use joins and more queries, which raises the running time to 57-76 ms.
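For comparison, a rough sketch of the relational variant being described, using MySQLdb and a hypothetical normalized schema (projects, categories and daily_stats tables); the schema and column names are invented only to illustrate the extra join, they are not the author's actual tables:

import MySQLdb

conn = MySQLdb.connect(host="localhost", user="stats",
                       passwd="secret", db="stats_service")
cur = conn.cursor()

# Step 1: the project row (access rights, settings).
cur.execute("SELECT id, owner_id, settings FROM projects WHERE id = %s", (42,))
project = cur.fetchone()

# Step 2: the period's data joined with the categories table.
cur.execute(
    """
    SELECT c.name, s.day, s.value
    FROM daily_stats s
    JOIN categories c ON c.id = s.category_id
    WHERE s.project_id = %s AND s.day BETWEEN %s AND %s
    ORDER BY c.name, s.day
    """,
    (42, "2013-09-01", "2013-09-30"),
)
rows = cur.fetchall()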
Statistics collection in the first year is projected at about 1,000,000 inserts and updates per day, i.e. roughly 12 writes per second (1,000,000 / 86,400 ≈ 11.6). Reads will be higher.
Here is what I think. On the one hand, the speed is certainly appealing, and Mongo has matured: many of the flaws of the early versions seem to be gone by now. But it is not only about speed, it is also about the logic. For example, no heavy selections are planned on the database side; essentially all the logic lives in application scripts. I also liked working with a document rather than with rows. On the other hand, Mongo has plenty of negative or simply unclear points: data loss, the lock on the entire database. With traditional MySQL you feel confident.
In general, I would like to hear from people who have already used Mongo in serious projects. How hard will it be to scale later, how likely is data loss (by the way, I still don't understand under what circumstances data can be lost), and what other pitfalls are there? Would you pick it again for the same purpose, or would you do it the old-fashioned way?
Thanks for reading.
Take Postgres instead.
It supports JSON fields out of the box (since 9.2) and queries on JSON fields (since 9.3).
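A minimal sketch of what that could look like with psycopg2, assuming a hypothetical daily_stats table with a json column named data; the table, column and metric names are made up for illustration:

import psycopg2

conn = psycopg2.connect("dbname=stats_service user=stats")
cur = conn.cursor()

# A json column (PostgreSQL 9.2+) holding the day's metrics as a document.
cur.execute("""
    CREATE TABLE IF NOT EXISTS daily_stats (
        project_id integer NOT NULL,
        day        date    NOT NULL,
        data       json    NOT NULL
    )
""")
conn.commit()

# The ->> operator (9.3+) extracts a field of the JSON document as text.
cur.execute(
    """
    SELECT day, (data ->> 'visits')::int AS visits
    FROM daily_stats
    WHERE project_id = %s AND day BETWEEN %s AND %s
    ORDER BY day
    """,
    (42, "2013-09-01", "2013-09-30"),
)
rows = cur.fetchall()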
Yes, it is slightly slower than MySQL, but both are faster than Mongo - load at least 10M random records and see for yourself.
Yes, a cluster will be a bit harder to set up, but it is well within reach of a developer who spends a day reading the documentation.
And unlike Mongo, there is no single-writer restriction (the lock that allows only one write at a time). This was never fixed in Mongo and it looks like it never will be. (It was very disappointing to run into this when the project grew and the database simply could not keep up with the writes and hung.) There is also no mysterious extra 100 milliseconds that mongos adds to every request when sharding.
In addition, aggregation in MongoDB is inconvenient - in effect you write the query execution plan by hand, whereas with classic SQL everything is more or less simple and clear.
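To illustrate that point, here is a sketch comparing a per-category sum in both systems; the collection/table and field names (daily_stats, category, value) are assumptions for illustration only:

import datetime
from pymongo import MongoClient

db = MongoClient()["stats_service"]
date_from = datetime.datetime(2013, 9, 1)
date_to = datetime.datetime(2013, 9, 30)

# MongoDB: the aggregation pipeline spells out the execution steps one by one.
pipeline = [
    {"$match": {"project_id": 42,
                "day": {"$gte": date_from, "$lte": date_to}}},
    {"$group": {"_id": "$category", "total": {"$sum": "$value"}}},
    {"$sort": {"total": -1}},
]
mongo_totals = list(db.daily_stats.aggregate(pipeline))

# The equivalent classic SQL is a single declarative statement:
sql = """
    SELECT category, SUM(value) AS total
    FROM daily_stats
    WHERE project_id = %s AND day BETWEEN %s AND %s
    GROUP BY category
    ORDER BY total DESC
"""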
For very large volumes of statistics that need analytics, I recommend looking at columnar databases; some have proven themselves very well.
If I understood correctly, this is essentially time-series data, but the data structure is a kind of tree: a record can be written as Domain.Group.Entity.Properties, and you need to be able to group by any of the nodes.
Maybe then look at specialized solutions like influxdb.org?
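A small sketch of what grouping by a node of such a Domain.Group.Entity.Properties key might look like in plain Python; the sample keys and values are invented for illustration:

from collections import defaultdict

# Each measurement is keyed by a dotted path plus a value.
points = [
    ("site.frontend.page_views.count", 120),
    ("site.frontend.unique_visitors.count", 80),
    ("site.backend.api_calls.count", 300),
]

def group_by_level(points, level):
    """Sum values by the node at the given depth of the dotted key."""
    totals = defaultdict(int)
    for key, value in points:
        node = key.split(".")[level]
        totals[node] += value
    return dict(totals)

print(group_by_level(points, 1))  # {'frontend': 200, 'backend': 300}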
MongoDB is perfectly usable. I took it for a similar task in my current project and have no complaints so far - also collecting and processing various statistics. A handy bit: some selections/aggregations were moved to server-side JS, which lives in Mongo itself, rather like stored procedures, so the calls in the application code became simpler. So far it runs in test mode, and there has been no force majeure or data loss. But it is worth keeping proper replication in mind. The database-wide locks do not really get in the way yet, and 10gen promises to move them to the collection level soon.
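A minimal sketch of the "stored procedure" approach this answer mentions: storing a server-side JavaScript function in the system.js collection via pymongo. The function name, collection and fields are assumptions for illustration, not the answerer's actual code:

from bson.code import Code
from pymongo import MongoClient

db = MongoClient()["stats_service"]

# Server-side JavaScript functions live in the special system.js collection.
db["system.js"].replace_one(
    {"_id": "dailyTotal"},
    {"_id": "dailyTotal",
     "value": Code("""
        function (projectId, day) {
            var total = 0;
            db.daily_stats.find({project_id: projectId, day: day})
              .forEach(function (doc) { total += doc.value; });
            return total;
        }
     """)},
    upsert=True,
)

# The stored function can then be loaded and called from the mongo shell:
#   db.loadServerScripts(); dailyTotal(42, ISODate("2013-09-01"))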