Processors

mrbaranovskyi, 2020-07-13 17:24:36

Linus Torvalds' latest comments about AVX-512?

The net is buzzing with news that Torvalds spoke quite harshly about the new 512-bit AVX instructions, support for which is appearing in the new version of GCC. From what I understand, using this instruction set causes the CPU to lower its clock frequency a bit (seriously??). The main message of his comments was that companies should improve plain math performance rather than add something so "terrible".

I don't understand why it's so terrible. For example, a single 512-bit register is enough to hold an entire 4x4 single-precision floating-point matrix. That sounds very good, especially for anyone doing geometry processing.

I'm not well versed in the topic, and the arguments aren't very clear to me. Can someone explain the pros and cons in a nutshell?


2 answers
Mikhail Usotsky, 2020-07-15
@mrbaranovskyi

The frequency reduction is quite real. Such a wide datapath on a chip is not easy to keep synchronized. It's for this reason that video cards, VLIW processors, and any parallel compute node always run at reduced frequencies. In practice, though, a manufacturer can resort to tricks: for example, instead of computing 8 elements simultaneously, it computes 2 elements four times over. That's just an illustration, but there is a real example: the Pentium 4 had 16-bit ALUs inside, and a 32-bit operation was performed in two stages. In exchange, the frequency could be raised, and this is what explains the P4's very long pipeline.
And I can somewhat understand Torvalds' criticism. AVX-512 essentially duplicates existing instructions using longer registers. I've also read various materials saying AVX-512 is broken into subsets that will be supported differently on different models. Can you imagine what a headache compiler developers will have? Although, in general, the x86/x86-64 instruction set has always been famous for being a mess (it's amazing that they still manage to execute instructions quickly). I'm not even talking about the difficulties for developers of end-user programs like CAD packages or multimedia editors: it turns out they have to specify additionally which processors are supported, which greatly narrows the choice.
But he is right that it would be better to improve multithreading. After all, you don't compute a 4x4 matrix every day, and what if you need a 5x5 matrix? Problem. Multithreading, on the other hand, can always be controlled. I always prefer a multithreaded implementation in development: it's easier to design and control, and when you don't need high performance, you can disable unneeded cores. I forgot to add that growth in the number of x86/x86-64 instructions inevitably increases the size of the micro-op memory inside the processor.
And if you really need such wide operations, it's better to offload them to a compute accelerator such as a video card. They've been doing this for a long time, and their registers are very long. It can also be implemented as an APU, as AMD does. That's much more practical.

xmoonlight, 2020-07-13
@xmoonlight

Refer to the original source and don't play "telephone" here.
