A large number of files and folders: split them up or not?
Suppose, for example, there are a million folders, each containing a million files. Two options:
1. Write all million folders directly into one main folder.
2. Create 100 intermediate folders inside the main folder and distribute the million folders among them.
How does the Linux file system actually handle this? That is, how does it find a file: does it walk through the entries of each directory until one matches the request? For example, if I specify the path /main/56/1.jpg, does it compare the other folder names against "56" one by one (starting from the first entry: is it 56? no; is it 56? no; ...)? And is there a fundamental difference in performance when accessing a specific file?
That is, /main/1/2/1.jpg versus /main/1/1.jpg.
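Here is roughly how one might try to measure the difference (a minimal PHP sketch; the paths and iteration count are made up, and on a warm cache the kernel's dentry cache will hide most of the cost, so meaningful numbers require a cold cache):

```php
<?php
// Hypothetical micro-benchmark: time repeated lookups of the same file
// in a flat layout vs. a nested layout. Paths are placeholders.
$flat   = '/main/1/1.jpg';      // one directory holding many entries
$nested = '/main/1/2/1.jpg';    // extra directory level, fewer entries each

function timeLookups(string $path, int $iterations = 100000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        // clearstatcache() forces PHP to re-stat the path instead of
        // answering from its own stat cache; the kernel caches still apply.
        clearstatcache(true, $path);
        file_exists($path);
    }
    return microtime(true) - $start;
}

printf("flat:   %.3f s\n", timeLookups($flat));
printf("nested: %.3f s\n", timeLookups($nested));
```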
> there are a million folders, each containing a million files.
It sounds like you are ready for a DBMS.
Yes, you should split them up. There are many reasons for this, but essentially they all come down to one thing: the more objects in a folder, the more resources it takes to find one of them (and to perform other operations), in some cases proportionally to the folder's size.
Finding a file (or an empty slot to create a new one) among a million entries in a single folder is harder than first finding its subfolder among a thousand subfolders and then finding the desired file among a thousand entries in that subfolder, even with directory-indexing techniques.
In short, it is desirable to keep the number of objects per folder at roughly 1k-5k. Single-character subfolders, for example, are not efficient: there will be many levels of them, and the number of lookup operations grows considerably (even though each individual lookup is relatively cheap). Ideally use 3 (at most 4) digits per subfolder level, or 2 characters if letters are included (so that names are spread evenly across each level).
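For illustration, a minimal PHP sketch of such a layout. The md5-based sharding, the helper name shardedPath, and the two-hex-characters-per-level choice are assumptions made for this example, not something prescribed above; two hex characters give at most 256 subfolders per level, which stays well under the ~1k-5k guideline.

```php
<?php
// Sketch (hypothetical helper): map a file name onto a nested directory
// path using 2 hex characters per level, e.g. /main/<xx>/<yy>/cat.jpg.
function shardedPath(string $baseDir, string $fileName, int $levels = 2): string
{
    $hash  = md5($fileName);                  // spreads names evenly over hex chars
    $parts = [];
    for ($i = 0; $i < $levels; $i++) {
        $parts[] = substr($hash, $i * 2, 2);  // two characters per level
    }
    return $baseDir . '/' . implode('/', $parts) . '/' . $fileName;
}

$target = shardedPath('/main', 'cat.jpg');
@mkdir(dirname($target), 0755, true);         // create the subfolders if missing
echo $target, "\n";
```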
I solve this problem by generating a uniqid in PHP (any equivalent in another language will do) and then splitting the resulting value into subdirectories. The splitting depth is determined by how frequently new files are added. The resulting layout performs acceptably on any current file system.
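A minimal sketch of what that scheme might look like (the helper name storeFile, the two-character split, and the depth of 2 are assumptions for the example; the answer only specifies uniqid plus splitting into subdirectories):

```php
<?php
// Sketch of a uniqid()-based scheme: generate an id, split its leading
// characters into subdirectories, and store the file under that path.
function storeFile(string $baseDir, string $sourcePath, int $depth = 2): string
{
    $id  = uniqid('', true);                  // e.g. "6650f3a1b2c4d1.23456789"
    $id  = str_replace('.', '', $id);         // keep the name filesystem-friendly
    $dir = $baseDir;
    for ($i = 0; $i < $depth; $i++) {
        $dir .= '/' . substr($id, $i * 2, 2); // two characters per level
    }
    if (!is_dir($dir)) {
        mkdir($dir, 0755, true);              // create missing levels recursively
    }
    $target = $dir . '/' . $id;
    copy($sourcePath, $target);
    return $target;
}

// Usage (hypothetical paths):
// echo storeFile('/main', '/tmp/upload.jpg');
```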