git
Dima_Amigo, 2020-12-24 18:03:37

How to speed up docker push?

I have found that the docker push command is very slow, and it seems to be slow for no good reason.

The Dockerfile runs `RUN npm install` to download all dependencies. This pulls in about 600MB of files, and all of them end up in a single image layer, which is about 400MB compressed (the exact size is beside the point).
This layer is then pushed to a private Docker registry.

Question: if even one file in the dependencies changes, Docker creates a new 400MB layer and tries to push it again. Are there ready-made solutions so that docker push / pull does not transfer 400MB after the slightest change? Perhaps there is another utility that replaces docker push / pull and internally stores the contents of individual layers in git?

UPD:

Imagine if we could change the implementation of the docker push/pull commands.

The alternative docker push takes the desired image and writes it to an archive via docker save. The archive is then unpacked, together with all the layers inside it, and pushed into a dedicated git repository.

The alternative docker pull fetches the desired image from a specific branch or tag in git, rebuilds the archive from it, and loads the image into Docker via docker load.

This should be faster, because only the files that have changed are transferred.

The approach also has a disadvantage: every git fetch has to download absolutely all new changes (every version).

One could reinvent the wheel along these lines (perhaps with rsync instead of git), but maybe a ready-made solution already exists ...
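
To make the file-level idea from the UPD concrete, here is a minimal sketch of the comparison such a tool would need. It is not an existing utility. The `manifest` and `paths_to_transfer` helpers and the sample files are hypothetical, and the sketch assumes the layers have already been unpacked into path/content pairs (for example after docker save).

```python
import hashlib


def manifest(files):
    """Map each path to the SHA-256 of its content."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def paths_to_transfer(old_manifest, new_manifest):
    """Paths that are new or whose content differs from the previous version."""
    return sorted(path for path, digest in new_manifest.items()
                  if old_manifest.get(path) != digest)


# Hypothetical contents of the dependency layer before and after a change.
previous = manifest({
    "node_modules/a/index.js": b"module.exports = 1;\n",
    "node_modules/b/index.js": b"module.exports = 2;\n",
})
current = manifest({
    "node_modules/a/index.js": b"module.exports = 1;\n",
    "node_modules/b/index.js": b"module.exports = 3;\n",
    "node_modules/c/index.js": b"module.exports = 4;\n",
})

changed = paths_to_transfer(previous, current)
print(changed)
```

Only the modified and the added file are listed, which is the saving the idea relies on. The sketch ignores deletions, file modes and timestamps, which a real tool would also have to carry over.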

3 answer(s)
Lynn "Coffee Man", 2020-12-24
@Lynn

You can use a multi-stage build, like this:

FROM node:lts-alpine as build
WORKDIR /app
# copy the source code
COPY ./ ./
# install dependencies
RUN npm ci
# build the application
RUN npm run build
# keep only production dependencies
RUN npm ci --prod
# optionally remove tests, .ts files, etc. from the sources

FROM node:lts-alpine as app
WORKDIR /app
# copy node_modules first
COPY --from=build /app/node_modules ./node_modules
# then everything else
COPY --from=build /app ./
# and run it
CMD [ "node", "/app/app.js" ]

As a result, if the package-lock.json file does not change (i.e., the production dependencies do not change), the layer created by the command COPY --from=build /app/node_modules ./node_modules is taken from the cache. And that layer holds all 400 megabytes of our dependencies.
Live example:

This is the second run, after changing the application and the version in package.json but without changing the dependencies. You can see that build step 4 really was executed, yet step 9 (copying node_modules) was taken from the cache, because the node_modules folder did not change.
$ docker build .
Sending build context to Docker daemon  6.656kB
Step 1/11 : FROM node:lts-alpine as build
 ---> 1c342643aa5c
Step 2/11 : WORKDIR /app
 ---> Using cache
 ---> 01d641ac9d8b
Step 3/11 : COPY ./ ./
 ---> 2a369bda0312
Step 4/11 : RUN npm ci
 ---> Running in 5becac2f9f07
added 2 packages in 1.914s
Removing intermediate container 5becac2f9f07
 ---> c010ba772a08
Step 5/11 : RUN npm run build
 ---> Running in de6fd7f872a5

> [email protected] build /app
> echo building app

building app
Removing intermediate container de6fd7f872a5
 ---> dc80bc125954
Step 6/11 : RUN npm ci --prod
 ---> Running in 825f86a54af5
npm WARN prepare removing existing node_modules/ before installation
added 1 packages in 0.079s
Removing intermediate container 825f86a54af5
 ---> a00a029b86dc
Step 7/11 : FROM node:lts-alpine as app
 ---> 1c342643aa5c
Step 8/11 : WORKDIR /app
 ---> Using cache
 ---> 01d641ac9d8b
Step 9/11 : COPY --from=build /app/node_modules ./node_modules
 ---> Using cache
 ---> 81d587ccf147
Step 10/11 : COPY --from=build /app ./
 ---> bb40061f06b6
Step 11/11 : CMD [ "node", "/app/app.js" ]
 ---> Running in e6b9e08d9d8f
Removing intermediate container e6b9e08d9d8f
 ---> eceb38619009
Successfully built eceb38619009

Vitaly Karasik, 2020-12-25
@vitaly_il1

I am a practical person, so I do not believe that I can reinvent the wheel in a way that
1) will work fundamentally better than the existing solutions
2) will become generally accepted.
Therefore, I am in favor of a multi-stage build; in addition, you can separate "permanent" and "variable" packages into separate layers.
Usually about 80% of the packages are third-party and stay the same, and only a few of your own change.
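
As a rough back-of-envelope model of what splitting buys, the sketch below assumes that the registry skips layers whose digest it already holds. It uses the 400MB figure from the question and the 80% estimate above. The digest names and the `megabytes_to_push` helper are hypothetical.

```python
def megabytes_to_push(image_layers, registry_digests):
    """Total size of the layers whose digest the registry does not already hold."""
    return sum(size for digest, size in image_layers.items()
               if digest not in registry_digests)


# Hypothetical digests and sizes in MB after one in-house package changed.
single_layer_image = {"deps-v2": 400}
split_layer_image = {"third-party-v1": 320, "own-v2": 80}

# The registry still holds the layers from the previous push.
single_push = megabytes_to_push(single_layer_image, {"deps-v1"})
split_push = megabytes_to_push(split_layer_image, {"third-party-v1", "own-v1"})

print(single_push, split_push)
```

Under these assumptions the single-layer image re-uploads all 400MB, while the split image re-uploads only the 80MB layer with the packages that change.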

shurshur, 2020-12-25
@shurshur

This is fundamentally impossible. A layer is uploaded to the server only as a whole; there is no way to check that just one file in it differs. Moreover, it is not only the file that will differ but also, for example, the directory it was placed in, because its mtime changes. All of this is tracked through the checksum of the layer archive, which changes whenever any file is added or modified.
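
The effect can be reproduced in miniature. The sketch below is not Docker's code: it packs two hypothetical files into an in-memory tar archive and hashes the archive with SHA-256, on the assumption that a layer is identified by a checksum of this kind. The `layer_digest` helper and the file contents are made up for illustration.

```python
import hashlib
import io
import tarfile


def layer_digest(files, mtime):
    """Pack files into an uncompressed in-memory tar and return its SHA-256."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path in sorted(files):
            info = tarfile.TarInfo(name=path)
            info.size = len(files[path])
            info.mtime = mtime  # the same timestamp is stamped on every entry
            tar.addfile(info, io.BytesIO(files[path]))
    return hashlib.sha256(buf.getvalue()).hexdigest()


# Hypothetical layer contents.
base = {
    "node_modules/a/index.js": b"module.exports = 1;\n",
    "node_modules/b/index.js": b"module.exports = 2;\n",
}
edited = dict(base)
edited["node_modules/b/index.js"] = b"module.exports = 3;\n"

same_input_matches = layer_digest(base, 1000) == layer_digest(base, 1000)
new_mtime_matches = layer_digest(base, 1000) == layer_digest(base, 2000)
one_edit_matches = layer_digest(base, 1000) == layer_digest(edited, 1000)

print(same_input_matches, new_mtime_matches, one_edit_matches)
```

Identical input gives an identical digest, but a different mtime alone, or a single edited file, produces a different digest for the whole archive, so the whole layer counts as new.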
