Amazon Web Services
un1t, 2016-05-25 11:04:33

What is the best way to store millions of photos in S3 (Selectel)?

Please share your experience. Suppose I upload, say, 10 million 1 MB images into one bucket. Will I have problems copying them to another bucket for a backup? Or will some other problems show up?
And what is the best way to organize them: everything in one bucket, or a new bucket for every N photos?
I am currently using an S3 analogue from another hosting provider, and I already ran into problems at 100 thousand files. The list operation takes more than 10 minutes to complete, and sync starts by building that list. I'm afraid to imagine what will happen with a million files.

1 answer(s)
spotifi, 2016-05-25
@un1t

Or a similar Cloud Storage technology in Clodo or Rackspace ...
In general, though, none of these technologies like being overloaded. They are designed to guarantee other users access while you are streaming data, so they reserve resources for others and do not give everything to you.
You can't get around that, even if you deploy your own storage on a dedicated server using the same technologies (OpenStack Swift is open source, and you can set it up fairly easily, for example using Ceph + Object Storage).
You can try uploading in multiple parallel streams. This should help you get around the capacity reserved for other users.
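A minimal sketch of the multi-stream upload, not a tuned implementation. The names upload_all and upload_one are made up for illustration; upload_one is whatever function uploads a single file to your storage, and the worker count of 16 is an arbitrary starting point.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def upload_all(paths, upload_one, workers=16):
    """Upload every path with `workers` threads.

    `upload_one(path)` is a caller-supplied function (hypothetical here)
    that uploads one file and raises on failure.
    Returns (ok, failed): uploaded paths and (path, exception) pairs.
    """
    ok, failed = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(upload_one, p): p for p in paths}
        for fut in as_completed(futures):
            path = futures[fut]
            try:
                fut.result()
                ok.append(path)
            except Exception as exc:  # keep going, retry the failures later
                failed.append((path, exc))
    return ok, failed
```

If the storage exposes an S3-compatible API and you use boto3 (both are assumptions), upload_one could be something like lambda p: s3.upload_file(p, "my-bucket", p), where the bucket name is a placeholder. Collecting failures instead of aborting lets you re-run only the files that did not make it.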
You can also upload multiple files in one request, for the same reason.
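One way the single-request upload can look, assuming the storage is Swift-based and has the bulk (archive extraction) middleware enabled, which you would need to check with your provider. The sketch only packs the files into an in-memory tar.gz; the function name build_tar_gz is made up for illustration.

```python
import io
import tarfile


def build_tar_gz(files):
    """Pack {object_name: bytes} into one in-memory tar.gz archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # size must be set before addfile
            tar.addfile(info, io.BytesIO(data))
    # The archive is complete only after the tarfile has been closed.
    return buf.getvalue()
```

On Swift with that middleware, the resulting bytes would be sent as the body of one PUT to the container URL with the extract-archive=tar.gz query parameter, and the storage unpacks them into separate objects. Keep each archive reasonably small so a failed request does not cost you a huge re-upload.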
Make the copy through the cloud storage's own API rather than with external tools. The long listing can be partially worked around by creating subdirectories (for a file like 100501003.jpg, for example). This will not reduce the total time, but it will at least let you split the work into separate operations, one per directory, and run them in parallel. Here is another idea: parallelism is also suggested in https://chris-lamb.co.uk/posts/uploading-large-num...
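A rough sketch of both points together: spreading names over subdirectories and copying each directory server-side, in parallel. Everything here is an assumption for illustration: the 3-character, 2-level split in sharded_key is just one possible layout, and copy_prefix assumes a boto3-style S3 client (get_paginator("list_objects_v2") and copy_object), so it only applies if your storage speaks the S3 API.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def sharded_key(filename, width=3, depth=2):
    """Split the start of the name into directories (hypothetical layout)."""
    stem, ext = os.path.splitext(filename)
    if len(stem) <= width * depth:
        return filename  # too short to split; keep the flat name
    parts = [stem[i * width:(i + 1) * width] for i in range(depth)]
    return "/".join(parts + [stem[width * depth:] + ext])


def copy_prefix(client, src_bucket, dst_bucket, prefix):
    """Server-side copy of every object under one prefix; returns the count."""
    copied = 0
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=src_bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            client.copy_object(
                Bucket=dst_bucket,
                Key=obj["Key"],
                CopySource={"Bucket": src_bucket, "Key": obj["Key"]},
            )
            copied += 1
    return copied


def copy_prefixes(client, src_bucket, dst_bucket, prefixes, workers=8):
    """Run copy_prefix for each prefix in parallel; returns {prefix: count}."""
    prefixes = list(prefixes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(
            lambda p: copy_prefix(client, src_bucket, dst_bucket, p), prefixes
        )
        return dict(zip(prefixes, counts))
```

With this layout each listing only covers one directory, so a slow or failed directory can be retried on its own. The data never passes through your machine, because copy_object asks the storage to do the copy itself.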
