Should You Use Mongo?
Greetings!
Lately I keep hearing more and more about NoSQL, and MongoDB in particular. The topic caught my interest, but so far I haven't been able to find the information I need, so I'll ask here. Surely many of you have already experimented with it, and perhaps some are building serious high-load applications on MongoDB.
I'll say in advance: if I've got something wrong about MongoDB, it wasn't on purpose. I haven't even tried working with it yet; I've only read articles on Habr and the examples on the official site.
I'm currently developing a teaser network. The task seems trivial at first glance but turns out to be quite tricky when it comes to organizing the database structure: a huge number of relationships, lots of intermediate tables for many-to-many relationships, and so on. What attracted me to MongoDB was the way it models relationships. Question number 1:
Is working with MongoDB really cheaper in terms of resources when there are a lot of relationships? Take the simplest example (I'll write in "pseudo SQL"): a selection from 2 tables linked by a many-to-many relationship through an intermediate table:
table sites(
id int primary key auto_increment,
url varchar
)
table categories(
id int primary key auto_increment,
name varchar
)
table sites_categories(
site_id int,
category_id int
)
The task is to display a list of sites and the categories each site belongs to:
SELECT * FROM sites
while(SITE = mysql_result...)
{
//display site data
SELECT * FROM categories WHERE id IN (SELECT category_id FROM sites_categories WHERE site_id = SITE)
// display categories in a loop
}
I'm also wondering whether it's possible to work with MySQL and MongoDB at the same time, or rather, how sound that would be. I don't want to move the whole database to Mongo, only individual, especially tricky parts where the load is higher than I'd like.
I've also read that you can easily store files in MongoDB. Is that really the case, and which would be better: storing them the old-fashioned way in a dedicated folder with subdirectories by username / user id, or using MongoDB? (For example, in this situation: there are about 1k users, each with 40-50 small pictures, and pictures are served at a rate of about 100-150 per minute.)
PS: I apologize for any inaccuracies in the questions and for any unnecessary or missing information about my needs and the current state of affairs; designing database structures is not my strong suit...
Sorry for the off-topic, but in your examples, such as
SELECT * FROM categories WHERE id IN (SELECT category_id FROM sites_categories WHERE site_id = SITE)
is JOIN deliberately not used???
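For comparison, here is a minimal sketch of the same selection done with JOINs, so that a single query replaces the query-per-site loop. It uses Python's built-in sqlite3 module with an in-memory database only so that it runs on its own; the sample rows and the function name sites_with_categories are made up for illustration, and the same SELECT should carry over to MySQL.

```python
import sqlite3

# In-memory database with the three tables from the question (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (id INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sites_categories (site_id INTEGER, category_id INTEGER);
    INSERT INTO sites VALUES (1, 'a.example'), (2, 'b.example');
    INSERT INTO categories VALUES (1, 'news'), (2, 'games');
    INSERT INTO sites_categories VALUES (1, 1), (1, 2), (2, 2);
""")


def sites_with_categories(conn):
    """Fetch every site together with its category names in one query."""
    rows = conn.execute("""
        SELECT s.id, s.url, c.name
        FROM sites AS s
        LEFT JOIN sites_categories AS sc ON sc.site_id = s.id
        LEFT JOIN categories AS c ON c.id = sc.category_id
        ORDER BY s.id, c.id
    """)
    result = {}
    for site_id, url, name in rows:
        entry = result.setdefault(site_id, {"url": url, "categories": []})
        if name is not None:  # a site without categories yields one row with NULL name
            entry["categories"].append(name)
    return result


for site_id, site in sites_with_categories(conn).items():
    print(site_id, site["url"], site["categories"])
```

The LEFT JOIN keeps sites that have no categories at all; with an inner JOIN they would drop out of the list.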
Try storing pictures on Amazon.
Be careful! The principles of database design in Mongo are different. You need to understand how it stores objects and that it loads an object in its entirety.
In general, IMHO, many-to-many relationships are not a strong point of Mongo (or of the other NoSQL databases I am familiar with). They work most effectively with embedded objects, that is, one-to-one and one-way one-to-many relationships.
For Mongo, each site should include a category_id field holding the list of its categories. That is, in NoSQL a many-to-many relationship is implemented by storing, in one of the objects, the complete list of its links to the other.
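A rough sketch of that layout, written as plain Python dictionaries so it runs without a server. The field name category_ids, the sample documents and both helper functions are assumptions made for illustration. In MongoDB itself, a filter such as {"category_ids": 2} matches documents whose array contains 2, and the second lookup would correspond to an {"_id": {"$in": [...]}} filter on the categories collection.

```python
# Hypothetical documents: each site stores the full list of its category ids.
categories = [
    {"_id": 1, "name": "news"},
    {"_id": 2, "name": "games"},
]
sites = [
    {"_id": 1, "url": "a.example", "category_ids": [1, 2]},
    {"_id": 2, "url": "b.example", "category_ids": [2]},
]


def sites_in_category(sites, category_id):
    """Plain-Python equivalent of the filter {"category_ids": category_id}."""
    return [s for s in sites if category_id in s["category_ids"]]


def categories_of(site, categories):
    """Resolve the ids stored on a site into category names (a second lookup)."""
    by_id = {c["_id"]: c for c in categories}
    return [by_id[cid]["name"] for cid in site["category_ids"]]


print([s["url"] for s in sites_in_category(sites, 2)])
print(categories_of(sites[0], categories))
```

Note that the link is stored on one side only: finding the sites for a category is a query over the array, while finding the categories for a site is a second lookup by id.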
To get started, check out the document www.mongodb.org/pages/viewpage.action?pageId=5079114
Here is an example of what it might look like:
pastie.org/1226804
And a query, for example, to get all teasers for Moscow that still have more than 10 clicks left:
pastie.org/1226857
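The linked pastes are not reproduced here. As a rough idea of what such a query could look like, below is a sketch that assumes hypothetical fields cities and clicks_left on each teaser; the actual field names in the pastes may differ. The filter is written as a MongoDB-style filter document and evaluated by a tiny stand-in matcher so the example runs without a server. The matcher handles only equality, array membership and $gt, which is just enough for this one query.

```python
# Hypothetical teaser documents; field names are assumptions, not taken from the pastes.
teasers = [
    {"_id": 1, "title": "t1", "cities": ["Moscow", "Kazan"], "clicks_left": 25},
    {"_id": 2, "title": "t2", "cities": ["Moscow"], "clicks_left": 3},
    {"_id": 3, "title": "t3", "cities": ["Kazan"], "clicks_left": 40},
]

# Filter document: targeted at Moscow and more than 10 clicks remaining.
query = {"cities": "Moscow", "clicks_left": {"$gt": 10}}


def matches(doc, query):
    """Stand-in evaluator: supports equality, array membership and $gt only."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            if "$gt" in cond and not (value is not None and value > cond["$gt"]):
                return False
        elif isinstance(value, list):
            if cond not in value:
                return False
        elif value != cond:
            return False
    return True


found = [t["_id"] for t in teasers if matches(t, query)]
print(found)
```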
As for reliability and speed, honestly, I don't know: I haven't experimented at scale yet, so you'll have to do that research yourself. One thing I can say is that the speed of implementation and of extending the schema is a pleasure :)
Off-topic again:
I once reworked a teaser network single-handedly... It has no targeting, but it comfortably handled about 90 million teaser impressions per day (the volume has now dropped to about 50 million). All of this runs on a single, fairly powerful server (plus a second server that holds only the pictures).
There, all data about teasers, banned IPs and cost per impression/click is taken from memcache. Only if the data is missing from memcache for some reason (for example, eviction or a cold start) is it read from the database and immediately put back into memcache. Statistics for the last minute are also kept in memcache; once a minute that data is collected and aggregated, and the summarized statistics are written to MySQL. I.e., serving teasers does not actually touch the database at all.
(Before memcache, the file system was used for this data, and it was almost as fast as memcache.)
But it may not suit you, because I don't really understand how targeting could be implemented there.
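The caching scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions: plain dictionaries stand in for memcache and MySQL, and the class and method names (TeaserStore, get_teaser, record_impression, flush_stats) are invented here, not taken from the real system.

```python
class TeaserStore:
    """Read-through cache plus per-minute statistics, with dicts as stand-ins."""

    def __init__(self, db):
        self.db = db              # stand-in for MySQL: teaser_id -> teaser data
        self.cache = {}           # stand-in for memcache
        self.minute_stats = {}    # counters for the current minute (kept in memcache originally)
        self.db_stats = {}        # aggregated statistics (a MySQL table originally)
        self.db_reads = 0         # how many times the database was actually hit

    def get_teaser(self, teaser_id):
        data = self.cache.get(teaser_id)
        if data is None:
            # Cache miss (eviction or cold start): read from the database
            # and put the value straight back into the cache.
            self.db_reads += 1
            data = self.db[teaser_id]
            self.cache[teaser_id] = data
        return data

    def record_impression(self, teaser_id):
        # Serving a teaser only increments a counter in the cache.
        self.minute_stats[teaser_id] = self.minute_stats.get(teaser_id, 0) + 1

    def flush_stats(self):
        # Meant to be called once a minute: fold the counters into the
        # aggregated statistics and start a fresh minute.
        for teaser_id, count in self.minute_stats.items():
            self.db_stats[teaser_id] = self.db_stats.get(teaser_id, 0) + count
        self.minute_stats = {}


store = TeaserStore({1: {"cost_per_click": 0.5}})
store.get_teaser(1)
store.get_teaser(1)          # second call is served from the cache
store.record_impression(1)
store.flush_stats()
print(store.db_reads, store.db_stats)
```

With a real memcache the counters would need an atomic increment, since several worker processes update them at once; the dictionary version ignores concurrency entirely.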