Tag: caching
Valery, 2018-07-25 14:40:37

What database (or other tool) to choose for the cache of downloaded images?

Hello, can you recommend a database or some caching service for images downloaded from external resources?
I'm interested in something where I can set a limit on the RAM used for hot data, with everything that doesn't fit stored on disk. I also need the ability to set a lifetime (TTL) for objects. Images are up to 1 MB each.
Nginx needs direct access to the storage so that the images can be referenced directly.
Persistence and high data durability are not required; if a reboot wipes everything, that's perfectly fine.
Total RAM consumption should preferably stay under 1 GB; the planned total volume is about 100 GB.
Running it in Docker would be desirable, but if that doesn't work out, I'll figure it out myself.
Amazon S3 is more or less suitable (it integrates with nginx, has object TTL, and there is no hassle with memory), but the project is not on AWS, and using S3 from outside is terribly expensive. Ideally I'd like a self-hosted solution.
I found https://minio.io/, but it doesn't support object TTL.
Of course I could install Memcached, but it can't spill the overflow to disk (if it can, tell me how), and allocating 100 GB of RAM is a bit expensive. Redis can't do that either, and on top of that it has issues with evicting old objects when it fills up.



4 answers
m0nym, 2018-07-25
@m0nym

Aerospike.
In general, I would take any database for the metadata (and track cache expiry there) and use regular files to store the images themselves.
And if entries expire only once, i.e. the lifetime is never extended, it's even simpler: make the file name equal to its creation date or its expiry date.
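A minimal sketch of the second idea in this answer, in Python for illustration only: image bytes live as ordinary files on disk and the expiry timestamp is encoded in the file name, so no metadata database is needed at all. The cache directory, TTL value and helper names are assumptions made for the example, not part of the original answer.

```python
import hashlib
import os
import time

CACHE_DIR = "/var/cache/images"   # hypothetical cache location
DEFAULT_TTL = 24 * 3600           # hypothetical one-day lifetime


def _key(url: str) -> str:
    """Stable, file-name-safe key for a source URL."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def put(url: str, data: bytes, ttl: int = DEFAULT_TTL) -> str:
    """Save image bytes; the expiry time becomes part of the file name."""
    expires_at = int(time.time()) + ttl
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, f"{_key(url)}.{expires_at}")
    with open(path, "wb") as fh:
        fh.write(data)
    return path


def get(url: str) -> bytes | None:
    """Return cached bytes if a non-expired entry exists, else None."""
    if not os.path.isdir(CACHE_DIR):
        return None
    prefix = _key(url) + "."
    now = time.time()
    for name in os.listdir(CACHE_DIR):
        if not name.startswith(prefix):
            continue
        expires_at = int(name.rsplit(".", 1)[1])
        full = os.path.join(CACHE_DIR, name)
        if expires_at > now:
            with open(full, "rb") as fh:
                return fh.read()
        os.remove(full)  # expired entry, drop it
    return None


def cleanup() -> None:
    """Periodic pass (e.g. from cron) that removes expired files."""
    now = time.time()
    for name in os.listdir(CACHE_DIR):
        try:
            expires_at = int(name.rsplit(".", 1)[1])
        except (IndexError, ValueError):
            continue
        if expires_at <= now:
            os.remove(os.path.join(CACHE_DIR, name))
```

In a real setup you would shard the flat directory into subdirectories (a single folder holding ~100 GB of files is slow to scan) and let nginx serve CACHE_DIR as plain static files; the hot ones end up in the OS page cache, which also covers the RAM requirement.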

Stalker_RED, 2018-07-25
@Stalker_RED

CacheFS + a ramdrive?

Andrey K, 2018-07-25
@kuftachev

I may not quite understand the essence of the task, but why doesn't the OS's own file cache work for you?
If you want object storage without the high price, look at DigitalOcean: they implement the S3 API.
You could also take a separate virtual machine and give it the necessary resources, since persistence isn't important. That is also possible on the same DO.
In general, if you are not sure about the solution, it seems to me you could describe the underlying business task; maybe someone will suggest a completely different approach.

Roman Mirilaczvili, 2018-07-29
@2ord

You need to serve images through a caching reverse proxy.
Linux already knows how to cache files from the file system; you just need to give it (the OS) the chance to do that work. The OS itself will cache the most frequently requested files and will automatically free resources when needed.
I believe configuring Nginx is enough; nothing else is needed.
Varnish, Squid and Apache Traffic Server are also worth a look.
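For illustration only, here is a toy caching reverse proxy in Python that shows the mechanism this answer recommends; in practice you would simply configure Nginx (or Varnish, etc.) as stated above. The upstream URL, cache directory and TTL are made-up assumptions, and there is no error handling or header passthrough.

```python
import hashlib
import os
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

UPSTREAM = "https://images.example.com"   # hypothetical origin
CACHE_DIR = "/tmp/img-proxy-cache"
TTL = 3600                                # illustrative one-hour lifetime


class CachingProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        path = os.path.join(
            CACHE_DIR, hashlib.sha256(self.path.encode()).hexdigest())
        # Serve from the disk cache if the entry is still fresh.
        if os.path.exists(path) and time.time() - os.path.getmtime(path) < TTL:
            with open(path, "rb") as fh:
                body = fh.read()
        else:
            # Cache miss (or expired): fetch from the origin and store on disk.
            with urllib.request.urlopen(UPSTREAM + self.path) as resp:
                body = resp.read()
            os.makedirs(CACHE_DIR, exist_ok=True)
            with open(path, "wb") as fh:
                fh.write(body)
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), CachingProxy).serve_forever()
```

The point of the sketch is only to show what the caching proxy layer does: frequently requested images are answered from local disk (and, via the page cache, from RAM), while misses go to the origin once and are then reused.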
