linux
Goodver, 2012-05-05 16:12:48

A large number of files and folders: split them up or not?

For example, there are a million folders, each with a million files.

1. We simply put all million folders into one main folder.
2. We split the main folder into 100 subfolders and distribute our million folders among them.

How does a Linux file system find a file? Does it go through the entries of each folder one by one until it finds a match? That is, if I specify the path /main/56/1.jpg, does it compare the names of the previous 55 folders with 56, starting from the first one (is this 56? no; is this 56? no; and so on)? And is there a fundamental difference in performance when accessing a specific file?

That is, /main/1/2/1.jpg versus /main/1/1.jpg


5 answers
@sledopit, 2012-05-05

Split them up. Details.

ComodoHacker, 2012-05-05
@ComodoHacker

> there are a million folders in each of which there are a million files.
It looks like you are ready for a DBMS.

zuborg, 2012-05-05
@zuborg

You need to split them up. There are many reasons, but it all comes down to this: the more objects a folder holds, the more resources it takes to search it (and to perform other operations on it). In some cases the cost grows in proportion to the folder's size.
Finding a file (or a free slot for a new one) among a million entries in one folder is harder than first finding its subfolder among a thousand subfolders and then finding the file among a thousand entries in that subfolder. This holds even when the file system indexes its directories.
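
To put rough numbers on that comparison, here is a small sketch under a deliberately simplified assumption: directories are scanned linearly, so a lookup in a folder with n entries costs about n/2 name comparisons on average. The function name is made up for illustration, and file systems with directory indexing will narrow the gap considerably.

```python
def expected_linear_comparisons(entries_per_level):
    """Average name comparisons when every directory on the path
    is scanned linearly (a simplified model, not a real file system)."""
    return sum(n / 2 for n in entries_per_level)


# One flat folder with a million entries.
flat = expected_linear_comparisons([1_000_000])

# A thousand subfolders, each holding a thousand files.
nested = expected_linear_comparisons([1_000, 1_000])

print(flat, nested)  # 500000.0 1000.0
```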
Overall, aim for at most about 1,000 to 5,000 objects per folder. One-character subfolder names are inefficient: you end up with many nesting levels, and the number of lookups per path grows considerably (although each lookup is relatively cheap). Ideally, use 3 (at most 4) digits per subfolder name, or 2 characters if letters are included (so that names are spread evenly).
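
A minimal sketch of the "3 digits per subfolder" layout, assuming files are identified by a non-negative integer ID. The function name `sharded_path`, its parameters, and the `.jpg` extension are hypothetical and chosen only for illustration.

```python
import posixpath


def sharded_path(root, file_id, digits_per_level=3, levels=2, ext=".jpg"):
    """Map a numeric ID to root/ddd/ddd/<padded id><ext>.

    With 3 digits per level, each folder holds at most 1000 subfolders
    and each leaf folder holds at most 1000 files.
    """
    if file_id < 0:
        raise ValueError("file_id must be non-negative")
    # One extra group of digits distinguishes files inside the leaf folder.
    width = digits_per_level * (levels + 1)
    padded = str(file_id).zfill(width)
    if len(padded) > width:
        raise ValueError("file_id does not fit; add another level")
    dirs = [
        padded[i * digits_per_level:(i + 1) * digits_per_level]
        for i in range(levels)
    ]
    return posixpath.join(root, *dirs, padded + ext)


print(sharded_path("/main", 1234567))  # /main/001/234/001234567.jpg
```

Two levels of three digits cover up to a billion IDs; anything larger needs another level.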

Alexey Prokhorov, 2012-05-06
@megahertz

I solve this problem by generating a unique ID in PHP (for example with uniqid(); any equivalent will do) and then splitting the resulting value into subdirectories. How finely to split depends on how often new files are added. The resulting scheme works acceptably on any current file system.
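
A rough sketch of the same idea in Python rather than PHP. Here `uuid.uuid4()` stands in for the PHP ID generator, and the function names, the two-characters-per-level split, and the `.jpg` extension are assumptions made for the example, not the setup described above.

```python
import posixpath
import uuid


def path_from_id(root, unique_id, depth=2, chars_per_level=2, ext=".jpg"):
    """Use the leading characters of the ID as nested subdirectory names."""
    levels = [
        unique_id[i * chars_per_level:(i + 1) * chars_per_level]
        for i in range(depth)
    ]
    return posixpath.join(root, *levels, unique_id + ext)


def path_for_new_file(root, depth=2):
    """Generate a random hex ID and derive its storage path.

    A larger depth means finer splitting, for sites that add files often.
    """
    return path_from_id(root, uuid.uuid4().hex, depth=depth)


print(path_from_id("/main", "a1b2c3d4"))  # /main/a1/b2/a1b2c3d4.jpg
print(path_for_new_file("/main"))         # random; differs on every run
```

With two hex characters per level, each directory has at most 256 subdirectories.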

kamaikin, 2015-04-30
@kamaikin

You didn't search the internet very well... there are old articles on this:
habrahabr.ru/post/70147
vkamaikin.ru/page/obrabotka-fajlov-na-servere
