PHP
TrueDrago, 2013-08-23 23:11:34

Optimizing image downloads by URL

I ran into the following problem and would like to hear how it is solved in similar projects.

The entries contain links to images. These images need to be displayed in certain formats on the site.

So far I have implemented it as follows: the images are downloaded with copy in a loop, and then the preview cache is warmed up. But downloading one image takes about 0.7 seconds, which noticeably hurts script execution time when there are many requests. In terms of ads, it comes to roughly 7 minutes per 100 entries, which is quite a lot.
What is the most efficient way to optimize the process?

For now I'm looking at multi-threaded curl, but I would like to know what pitfalls this approach has and whether there are more efficient alternatives.

7 answer(s)
Alexey Akulovich, 2013-08-25
@TrueDrago

A single thread using curl_multi_init can download at least several hundred images.
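
A minimal sketch of that approach, assuming PHP 8 (curl handles are objects there, so spl_object_id works on them). The function name downloadAll, the URL => path map, the timeouts, and the window of 10 parallel transfers are illustrative assumptions, not something taken from the answer:

```php
<?php
// Sketch: download many images from one PHP process with curl_multi.
// $jobs maps URL => local file path; $window caps simultaneous transfers.
function downloadAll(array $jobs, int $window = 10): array
{
    $results = [];     // URL => true on success, false on failure
    $pending = $jobs;  // transfers that have not started yet
    $running = [];     // spl_object_id(handle) => [URL, file pointer]
    $mh = curl_multi_init();

    do {
        // Top up the window with new transfers.
        while (count($running) < $window && $pending) {
            $url = array_key_first($pending);
            $path = $pending[$url];
            unset($pending[$url]);

            $fp = fopen($path, 'wb');
            if ($fp === false) {
                $results[$url] = false;
                continue;
            }
            $ch = curl_init($url);
            curl_setopt_array($ch, [
                CURLOPT_FILE           => $fp,   // stream the body straight to disk
                CURLOPT_FOLLOWLOCATION => true,
                CURLOPT_FAILONERROR    => true,  // treat HTTP >= 400 as a failure
                CURLOPT_CONNECTTIMEOUT => 5,
                CURLOPT_TIMEOUT        => 30,
            ]);
            curl_multi_add_handle($mh, $ch);
            $running[spl_object_id($ch)] = [$url, $fp];
        }

        // Drive all transfers, then sleep until there is socket activity.
        curl_multi_exec($mh, $stillRunning);
        if ($stillRunning) {
            curl_multi_select($mh, 1.0);
        }

        // Collect transfers that have finished.
        while ($info = curl_multi_info_read($mh)) {
            $ch = $info['handle'];
            [$url, $fp] = $running[spl_object_id($ch)];
            unset($running[spl_object_id($ch)]);
            $results[$url] = ($info['result'] === CURLE_OK);
            curl_multi_remove_handle($mh, $ch);
            fclose($fp);
        }
    } while ($running || $pending);

    curl_multi_close($mh);
    return $results;
}
```

Because the body is streamed to a file rather than kept in a string, memory use should stay roughly flat regardless of image size. A failed transfer leaves an empty or partial file behind, so the caller would want to delete files whose result is false.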

Sergey Protko, 2013-08-23
@Fesor

Parallelizing the download requests should help. The pitfalls depend on what kind of images these are. In theory, I can only assume the script might crash from running out of memory, although I doubt that 5-10 images could bring it down. You will have to watch for memory leaks and the like.

edogs, 2013-08-24
@edogs

"it takes about 0.7 seconds to load 1 picture"
What are those 0.7 seconds spent on? Establishing the connection, DNS resolution, the download itself, something else?
copy, if you mean the ordinary copy used literally over http, is the worst option. DNS resolution alone, for example, can take 0.5 seconds in some cases, and with copy it will very likely happen on every call.
Use curl, do all the downloads in one session, and enable the DNS cache in the curl options with a generous lifetime. This alone can improve the result by an order of magnitude.
PHP's multi-curl is still imperfect :( We would run something from the shell through exec, for example wget in several threads, or something that is multi-threaded from the start.
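
A rough sketch of the "one session plus DNS cache" advice, which also reports where the time goes for each download. The function name fetchSequentially, the one-hour cache lifetime, and the timeouts are assumptions made for illustration; CURLOPT_DNS_CACHE_TIMEOUT and the CURLINFO_* timing fields are standard curl options:

```php
<?php
// Sketch: reuse one curl handle for every download so connections and the
// DNS cache are shared, and record a per-request timing breakdown.
// $jobs maps URL => local file path.
function fetchSequentially(array $jobs, int $dnsCacheSeconds = 3600): array
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_DNS_CACHE_TIMEOUT => $dnsCacheSeconds, // keep resolved hosts cached
        CURLOPT_RETURNTRANSFER    => true,
        CURLOPT_FOLLOWLOCATION    => true,
        CURLOPT_FAILONERROR       => true,  // treat HTTP >= 400 as a failure
        CURLOPT_CONNECTTIMEOUT    => 5,
        CURLOPT_TIMEOUT           => 30,
    ]);

    $results = [];
    foreach ($jobs as $url => $path) {
        // Only the URL changes; the handle (and its connection pool) is reused.
        curl_setopt($ch, CURLOPT_URL, $url);
        $body = curl_exec($ch);
        $ok = $body !== false && file_put_contents($path, $body) !== false;

        $results[$url] = [
            'ok'      => $ok,
            'dns'     => curl_getinfo($ch, CURLINFO_NAMELOOKUP_TIME), // seconds
            'connect' => curl_getinfo($ch, CURLINFO_CONNECT_TIME),    // seconds
            'total'   => curl_getinfo($ch, CURLINFO_TOTAL_TIME),      // seconds
        ];
    }
    return $results;
}
```

If 'dns' or 'connect' dominates 'total' on the first request to a host and drops to near zero on later ones, the reuse is working; if 'total' stays high anyway, the bottleneck is the transfer itself.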

KEKSOV, 2013-08-24
@KEKSOV

To organize parallel downloading and data processing I use pcntl_fork. The main thing is to control the number of simultaneous workers, otherwise you will bring the system down. And, as edogs said above:

"Use curl, do all the downloads in one session, and enable the DNS cache in the curl options with a generous lifetime. This alone can improve the result by an order of magnitude."
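
A hedged sketch of capping the number of forked workers. It assumes the pcntl extension is available (it normally is only in CLI builds on Unix-like systems); runWorkers and the $work callback are hypothetical names, not the author's actual code:

```php
<?php
// Sketch: run each job in a forked child, never more than $maxWorkers at once.
function runWorkers(array $jobs, int $maxWorkers, callable $work): void
{
    $running = 0;
    foreach ($jobs as $job) {
        // At the cap: block until any child exits before forking another.
        if ($running >= $maxWorkers) {
            pcntl_wait($status);
            $running--;
        }

        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('fork failed');
        }
        if ($pid === 0) {
            // Child process: do the work and exit without returning to the caller.
            try {
                $work($job);
                exit(0);
            } catch (Throwable $e) {
                exit(1);
            }
        }
        $running++; // parent process
    }

    // Reap the remaining children so no zombies are left behind.
    while ($running > 0) {
        pcntl_wait($status);
        $running--;
    }
}
```

Children share nothing with the parent after the fork, so results have to come back through files, a database, or a queue. Connections opened before the fork (database, curl handles) should be reopened inside the worker.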

stnw, 2013-08-23
@stnw

For example, the previews do not have to be generated right away. They may never be needed at all. In one project I used LiipImagineBundle, which generates a preview only at the moment it is first requested. Or you can simply run a separate command later to generate them.
But the more correct approach here is to debug your script, find out exactly where the bottleneck is, and optimize that.
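
The "generate on first access" idea can be sketched without any bundle. This is not how LiipImagineBundle is implemented; previewPath, resizeWithGd, the md5-based cache layout, and the JPEG output are all illustrative assumptions, and the default resizer assumes the GD extension is installed:

```php
<?php
// Sketch: build a preview only when it is first requested, then reuse the file.
function previewPath(string $source, int $width, string $cacheDir, ?callable $resize = null): string
{
    $resize = $resize ?? 'resizeWithGd';
    $target = sprintf('%s/%s_%d.jpg', rtrim($cacheDir, '/'), md5($source), $width);

    if (!is_file($target)) {
        // Cache miss: write to a temp name, then rename, so a concurrent
        // reader never sees a half-written preview.
        $tmp = $target . '.' . uniqid('', true) . '.tmp';
        $resize($source, $tmp, $width);
        rename($tmp, $target);
    }
    return $target;
}

// One possible resizer (assumes the GD extension).
function resizeWithGd(string $source, string $target, int $width): void
{
    $image = imagecreatefromstring(file_get_contents($source));
    if ($image === false) {
        throw new RuntimeException("Cannot decode image: $source");
    }
    // Omitting the height makes GD derive it and keep the aspect ratio.
    $scaled = imagescale($image, $width);
    imagejpeg($scaled, $target, 85);
}
```

With this shape the import script only has to download the originals; the resize cost is paid once per preview size, and only for images that someone actually views.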

Pavel Volintsev, 2013-08-24
@copist

See habrahabr.ru/qa/44278/#answer_171184

Nikolai Vasilchuk, 2013-08-23
@Anonym

The most obvious options:
1. Download the images with curl; until they are loaded, show placeholders in their places on the site.
2. Resize an image when it is first viewed, not when it is downloaded. nginx can do this, or you can call ImageMagick from PHP.
