MySQL
Mikhail Shatilov, 2014-12-03 01:29:01

Is this a good way to store files on a server?

Task: an unregistered user can upload and download their own files. Later, they can register and get additional functionality.
File storage idea:
There is a storage class that writes, reads, and stores files by key (an MD5 hash). Directory structure:

/user_files/ef/eff7d5dba32b4da32d9a67a519434d3f.zip
/user_files/d5/d58e3582afa99040e27b92b13c8f2280.zip
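For illustration, the layout above could be derived from the key roughly like this. It is only a sketch in Python rather than PHP; the function name `storage_path` and the fixed `zip` extension are assumptions taken from the two example paths, not part of the original storage class.

```python
from pathlib import PurePosixPath

# Root directory taken from the example paths above.
STORAGE_ROOT = PurePosixPath("/user_files")

def storage_path(key: str, extension: str = "zip") -> PurePosixPath:
    """Map a 32-character hex key to <root>/<first two chars>/<key>.<ext>."""
    key = key.lower()
    # Rejecting anything that is not plain hex also blocks "../" style input.
    if len(key) != 32 or any(c not in "0123456789abcdef" for c in key):
        raise ValueError("key must be a 32-character hex string")
    return STORAGE_ROOT / key[:2] / f"{key}.{extension}"

print(storage_path("d58e3582afa99040e27b92b13c8f2280"))
# prints /user_files/d5/d58e3582afa99040e27b92b13c8f2280.zip
```

Validating the key before building the path matters because the key arrives in the download URL and is therefore user input.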

There is also a controller class that manages the storage class and handles access control. Keys are generated, for example, as md5(user_id+file_name+...). The database stores the user_id, the original file name, and the key.
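A rough sketch of that key scheme, in Python for illustration (PHP's md5() would play the same role). The `extra` parameter is a hypothetical stand-in for the unspecified "..." part of the hashed string, and the `files_table` dictionary is a hypothetical stand-in for the database table.

```python
import hashlib

def make_key(user_id: int, file_name: str, extra: str = "") -> str:
    """Hash the user id, the file name, and any extra component into a hex key."""
    # "\0" separators keep (1, "2a") and (12, "a") from producing the same string.
    raw = "\0".join([str(user_id), file_name, extra])
    return hashlib.md5(raw.encode("utf-8")).hexdigest()

# Hypothetical stand-in for the database: key -> user_id and original file name.
files_table = {}

def register_upload(user_id: int, file_name: str, extra: str = "") -> str:
    """Store the metadata row and return the key, refusing to overwrite a row."""
    key = make_key(user_id, file_name, extra)
    if key in files_table:
        raise KeyError("key already in use: " + key)
    files_table[key] = {"user_id": user_id, "original_name": file_name}
    return key
```

With plain concatenation, user 1 uploading "2a" and user 12 uploading "a" would hash the same string, which is why the sketch joins the parts with a separator. The duplicate check also shows that the same user uploading the same file name twice needs some varying component in the hashed string.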
Download (served via PHP):
/user-files/download/d58e3582afa99040e27b92b13c8f2280/ -> Мой документ.zip

1. All files on 1 server.
2. High loads are not expected.
Are there any downsides to this scheme?

7 answer(s)
zooks, 2014-12-03
@zooks

MD5 hashes are not unique: in theory, two different inputs can produce the same hash.

Vitaly Peretyatko, 2014-12-03
@viperet

Also, instead of serving the download through PHP, you can hand it off to nginx via X-Accel-Redirect, if nginx is installed.
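As a sketch of what the application side might send, here is a Python function that only builds the response headers. The `/protected-files` prefix is a hypothetical nginx location that would have to be marked `internal` and mapped onto the storage directory, and the function name is made up for the example.

```python
from urllib.parse import quote

# Hypothetical nginx location, assumed to be declared `internal` in the config.
INTERNAL_PREFIX = "/protected-files"

def download_headers(key: str, original_name: str, extension: str = "zip") -> dict:
    """Build headers that ask nginx to send the stored file under its original name."""
    if len(key) != 32 or any(c not in "0123456789abcdef" for c in key):
        raise ValueError("key must be a 32-character lowercase hex string")
    return {
        # nginx intercepts this header and serves the file itself.
        "X-Accel-Redirect": f"{INTERNAL_PREFIX}/{key[:2]}/{key}.{extension}",
        "Content-Type": "application/octet-stream",
        # filename*=UTF-8''... carries non-ASCII names such as "Мой документ.zip".
        "Content-Disposition": "attachment; filename*=UTF-8''" + quote(original_name, safe=""),
    }
```

The access check still happens in the application; only the byte transfer moves to nginx, so the application process is not tied up for the duration of the download.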

Alexander Wolf, 2014-12-03
@mannaro

It's a perfectly normal system.

Philipp, 2014-12-03
@zoonman

The scheme is not bad, but it's better to use sha1_file() to detect identical files. That is, first compute a checksum, then search the database for a duplicate; if one exists, check whether the files actually match (this can take time, so background processing should be thought through). If the files are identical, there is no point in keeping copies. The database schema becomes a little more complicated, but deduplication saves a significant amount of disk space.
Metadata must be stored separately for each user, and each user's link should be unique.
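The checksum-then-compare flow might look roughly like this in Python, where `hashlib.sha1` over the file contents plays the role of PHP's sha1_file(). The `stored_by_checksum` dictionary is a hypothetical stand-in for a database index, and the function names are invented for the sketch.

```python
import filecmp
import hashlib

def sha1_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-1 of a file's contents without loading it all into memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical stand-in for a database index: checksum -> path of the stored copy.
stored_by_checksum = {}

def find_or_register(path):
    """Return the path of the stored copy that this upload should point to."""
    checksum = sha1_of_file(path)
    existing = stored_by_checksum.get(checksum)
    # A matching checksum is only a hint, so confirm with a byte-by-byte comparison.
    if existing is not None and filecmp.cmp(existing, path, shallow=False):
        return existing
    # First file with this checksum becomes the stored copy; a colliding file
    # with different content is kept under its own path and is not indexed here.
    stored_by_checksum.setdefault(checksum, path)
    return path
```

Each user's metadata row would then reference the shared stored copy while keeping its own original file name and its own download key.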

Alexander Aksentiev, 2014-12-03
@Sanasol

It's a perfectly normal system.
I do almost the same thing myself, except that the files are "faceless": the title and metadata live in the database.
Files are served through PHP, and the folders themselves are, of course, closed to direct access.

  • /uploads/05/03/2014/2a7db79344e0bfe58975de0b505310c6
  • /uploads/03.05.2014/37d7af3859d055fc8685334b03b052dd

Andrey Burov, 2014-12-03
@BuriK666

Quite fine. See nginx.org/en/docs/http/ngx_http_proxy_module.html#...

FanatPHP, 2014-12-03
@FanatPHP

The downsides are the standard ones.

  • For a million files, one level of nesting is enough, but with ten million, problems will start once each directory holds several tens of thousands of files. I remember the "no more than a million" part, but then again, "640K of memory ought to be enough for anybody...", oh well.
  • User file names are known to be non-unique. Don't forget to add microtime to the hashed string.
  • Make sure the hashed string does not start with a common prefix (say, the full path to the file on disk); otherwise the first characters of the hash will be unevenly distributed, and the directories will fill unevenly.
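The first two points can be sketched as follows, again in Python for illustration. `time.time_ns()` stands in for PHP's microtime(); the random token is an extra precaution that is not part of the answer, and all function names are hypothetical.

```python
import hashlib
import secrets
import time
from pathlib import PurePosixPath

def unique_key(user_id: int, file_name: str) -> str:
    """Hash a timestamp and a random token together with the user-supplied parts."""
    # The random token is an addition for the sketch, not something the answer requires.
    raw = f"{time.time_ns()}\0{secrets.token_hex(8)}\0{user_id}\0{file_name}"
    return hashlib.md5(raw.encode("utf-8")).hexdigest()

def nested_path(key: str, levels: int = 2, root: str = "/user_files") -> PurePosixPath:
    """Use the first `levels` pairs of hex characters as nested directory names."""
    parts = [key[2 * i : 2 * i + 2] for i in range(levels)]
    return PurePosixPath(root, *parts, key)

def files_per_directory(total_files: int, levels: int) -> int:
    """Average number of files per leaf directory, with 256 directories per level."""
    return total_files // (256 ** levels)

print(nested_path("d58e3582afa99040e27b92b13c8f2280"))
# prints /user_files/d5/8e/d58e3582afa99040e27b92b13c8f2280
print(files_per_directory(10_000_000, 1), files_per_directory(10_000_000, 2))
# prints 39062 152
```

With one level, ten million files average about 39,000 per directory; a second level brings that down to roughly 150.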
