Performance questions + which database is best to use?
I am building a statistics system like Yandex.Metrica in PHP, as an add-on to a teaser network.
Where is the best place to store user statistics? In which database?
Will MySQL handle hundreds of millions of records?
What is the best way to reduce server costs and get maximum performance?
Technologies that I plan to use:
PHP, MySQL, Node.js
Also, how can I determine how many requests are hitting a PHP file and the domain simultaneously, using standard Linux tools?
It all depends on what you will do with this data. If you just store it, MySQL will of course cope. If you run complex queries, it depends on the load, the number of queries, whether you have created the right indexes, and so on. And to give the indexes enough memory for such queries, you need to tune the MySQL settings properly.
If you are interested in how to speed up writes, you can first buffer everything (for example, in Redis) and then write it to the database in batches.
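A minimal sketch of that buffering idea, in Node.js since it is already in the planned stack. Everything specific here is an assumption for illustration: an in-memory array stands in for a Redis list, `writeBatch` stands in for the real database call, and the `hits` table with its `user_id`, `url`, `ts` columns is hypothetical.

```javascript
// Sketch only: an in-memory array stands in for a Redis list, and
// `writeBatch` stands in for a multi-row INSERT into the database.
// The table and column names (hits, user_id, url, ts) are hypothetical.

// Build one parameterized multi-row INSERT for a batch of events.
function buildBatchInsert(rows) {
  const placeholders = rows.map(() => "(?, ?, ?)").join(", ");
  const params = rows.flatMap((r) => [r.userId, r.url, r.ts]);
  return {
    sql: `INSERT INTO hits (user_id, url, ts) VALUES ${placeholders}`,
    params,
  };
}

// Collect events and hand them to `writeBatch` in groups of `batchSize`.
function createBatchBuffer(batchSize, writeBatch) {
  let pending = [];
  function flush() {
    if (pending.length === 0) return;
    const batch = pending;
    pending = [];
    writeBatch(buildBatchInsert(batch));
  }
  return {
    push(event) {
      pending.push(event);
      if (pending.length >= batchSize) flush();
    },
    flush, // call on a timer or at shutdown so a partial batch is not lost
    size: () => pending.length,
  };
}

// Usage: with batchSize 2, three events produce one write and leave one buffered.
const written = [];
const buffer = createBatchBuffer(2, (stmt) => written.push(stmt));
buffer.push({ userId: 1, url: "/a", ts: 1700000000 });
buffer.push({ userId: 2, url: "/b", ts: 1700000001 });
buffer.push({ userId: 1, url: "/c", ts: 1700000002 });
console.log(written.length, buffer.size()); // prints: 1 1
buffer.flush();
console.log(written.length, buffer.size()); // prints: 2 0
```

The point of the batch is that the database sees one statement per N events instead of N statements. If the buffer lives in the application process, as in this sketch, a crash loses whatever is pending, which is one reason to keep it in an external store such as Redis.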
If you are interested in how to speed up reads: caching, indexes, aggregation with things like Elasticsearch. But again, only if you actually have performance problems. Don't optimize prematurely. First write load tests, see how bad things are, and decide whether anything needs to be done.
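To illustrate what aggregation buys you on the read side, here is a rough sketch that rolls raw hits up into hourly counters, so that reports read a few summary rows instead of scanning every event. This is plain application code, not Elasticsearch, and the event shape (`url`, plus `ts` in Unix seconds) is an assumption made up for the example.

```javascript
// Sketch only: rolls raw hit events up into hourly counters per URL.
// The event shape ({ url, ts }, ts in Unix seconds) is hypothetical.
function rollupHourly(events) {
  const counts = new Map();
  for (const { url, ts } of events) {
    const hourStart = ts - (ts % 3600); // start of the hour, Unix seconds (UTC)
    const key = `${hourStart}|${url}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Turn the map back into rows that could be stored in a summary table.
  return [...counts].map(([key, hits]) => {
    const sep = key.indexOf("|");
    return {
      hourStart: Number(key.slice(0, sep)),
      url: key.slice(sep + 1),
      hits,
    };
  });
}

// Usage: two hits in the same hour collapse into one row.
const summary = rollupHourly([
  { url: "/a", ts: 3600 },
  { url: "/a", ts: 3700 },
  { url: "/a", ts: 7200 },
]);
console.log(summary);
```

Whether to do this at all is still a question for the load tests: it only pays off if reports over raw rows turn out to be too slow.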
Consider whether you can get by with NoSQL alone. If not, use PostgreSQL plus a NoSQL store. I would take Redis.
A service like this accumulates a huge amount of data. I would not take MySQL, so as not to dig a hole for myself at the very start. I agree with the previous answers and would suggest a combination of PostgreSQL for operational analytics and NoSQL or HDFS for long-term analytics.
I would consider the Elasticsearch option. :) https://www.elastic.co/products/elasticsearch
We use Cassandra to store statistics.
Its advantages: it is highly scalable and fault-tolerant, it is easy to use for statistics specifically, and it is fast to write to, which is very important for various affiliate and teaser programs.
Its disadvantages: it eats up a lot of disk space (more than predicted).
The load on our network is more than 1.5M per day. If your load is several times lower, an RDBMS (MySQL, PostgreSQL) is enough, but if it is comparable, then sooner or later you will run into a scaling problem.
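For a sense of scale, here is a quick back-of-the-envelope calculation. It assumes that "1.5M per day" means 1.5 million events per day and that each event becomes one stored record. Neither is stated in the answer, so treat the numbers as a rough guide only.

```javascript
// Back-of-the-envelope numbers; assumes "1.5M per day" means events per day
// and that every event is stored as one record.
const eventsPerDay = 1500000;
const secondsPerDay = 24 * 60 * 60; // 86400
const avgPerSecond = eventsPerDay / secondsPerDay;
const daysTo100M = 100000000 / eventsPerDay;

console.log(avgPerSecond.toFixed(1)); // prints: 17.4
console.log(Math.ceil(daysTo100M)); // prints: 67
```

Under those assumptions the average write rate is modest, but the table reaches the "hundreds of millions of records" from the question within a few months. Peak traffic will also be higher than the daily average.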