What database is better to use for storing and processing large amounts of data?
Good afternoon, ladies and gentlemen!
I am writing my thesis, and a question has come up to which I could not find a clear answer: which kind of database (relational or non-relational) is preferable for working with large amounts of data? The data come from tests of various complex technical systems (rocket engines, aircraft engines, etc.), so the volume is very large. How should the data be stored so that reading and writing are convenient and fast, and so that any kind of processing can be carried out on them?
I would like to hear the opinion of more experienced people on this matter.
Since this is your graduation project, dig into the essence of the problem.
As for what was written here about ElasticSearch and NoSQL: either it is just a tribute to fashion, or the person who answered does not understand the subject at all.
Different databases scale with varying success, that is true. And RDBMSs do, in general, seem to scale worse than NoSQL stores. However, there are workloads where NoSQL does well and others where it scales poorly and an RDBMS would be the better choice:
https://habrahabr.ru/post/231213/
But without knowing what data and what volume are in question, it is impossible to say anything specific. The same goes for the kind of load: querying data, adding data, or storing data. It is quite possible that SQL would be fine as well.
The ability to choose, configure, and program a specific database for a specific task is exactly what specialists get paid a lot of money for.
But an unsubstantiated "you need to take NoSQL, period" is a junior's answer.
First you need to decide what you want to build. Judging by your description, it will be a data warehouse. If you need to do complex analysis, you can choose between SQL Server and Oracle. Both DBMSs have a very good query optimizer and many analytic functions. In addition, SQL Server (I can't say for sure about Oracle) has columnstore indexes, which in your case can reduce the size of the stored data. Much more could be said, but the task as stated is vague (what data growth rate, what structure, etc.).
If you don't really want to deal with the relational model, then look at NoSQL.
If you need to write large amounts of information to the database quickly, you will most likely have to do it through an intermediate file (a file to which you write the data stream in "raw" form), because writing to the database is usually a rather expensive operation.
Then you feed this file to a separate program that writes it to the database, either on a schedule or with the processes synchronized in some way, but at a rate the chosen database can sustain.
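To make the intermediate-file idea concrete, here is a rough sketch in Python. Everything specific in it is my assumption, not something from the question: SQLite stands in for whatever database you end up choosing, the raw file is CSV, and the table name measurement, the columns (ts, channel, value) and the function names are made up for illustration. The acquisition side only appends to the file; a separate loader reads it and inserts in batches, one transaction per batch.

```python
import csv
import sqlite3


def append_raw(path, samples):
    """Acquisition side: append (ts, channel, value) rows to the raw staging file."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerows(samples)


def _flush(conn, batch):
    """Insert one batch inside a single transaction and return its size."""
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO measurement VALUES (?, ?, ?)", batch)
    return len(batch)


def load_batches(path, conn, batch_size=10_000):
    """Loader side: read the staging file and insert it into the database in batches."""
    # Hypothetical schema, chosen only for this sketch.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurement ("
        "ts REAL, channel TEXT, value REAL)"
    )
    total = 0
    batch = []
    with open(path, newline="") as f:
        for ts, channel, value in csv.reader(f):
            batch.append((float(ts), channel, float(value)))
            if len(batch) >= batch_size:
                total += _flush(conn, batch)
                batch = []
    if batch:  # remaining rows that did not fill a whole batch
        total += _flush(conn, batch)
    return total
```

In a real setup the loader would run on a schedule or be triggered when a test run finishes, and batch_size would be tuned to what the chosen database can absorb. Most server databases also have their own bulk-load tools, which are probably a better fit than row inserts for very large files.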