Why is sub faster than add?

assembler

belk, 2012-12-06 11:25:22

I measured the execution time of various instructions and found that the sub instruction is up to 2.5 times faster than the add instruction. Naturally, I did not believe it, so I wrote a bunch of tests.

Test Methodology:

The test consists of two files, .asm and .cpp.

Two procedures are declared in the .asm file: one consists of N add rax, rax instructions, the other of the same number of sub rax, rax instructions.

The .cpp file has a loop that calls one of the two procedures at random M times (so that there are no patterns) and measures the execution time of each call. At the end, the average number of ticks spent per instruction is printed for both procedures. Compiler optimization is disabled.
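A minimal sketch of what such a measurement loop could look like. This is not the original code: the names run_trial, Stats, Result and Kernel are hypothetical, the two procedures are passed in as plain function pointers standing in for the .asm routines, and std::chrono::steady_clock (nanoseconds) is used in place of a raw tick counter to keep the sketch portable.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>

// Hypothetical stand-in for a procedure from the .asm file; in the real test
// each one is N consecutive `add rax, rax` or `sub rax, rax` instructions.
using Kernel = void (*)();

struct Stats {
    std::uint64_t calls = 0;  // how many times this procedure was picked
    double total_ns = 0.0;    // accumulated time over those calls

    // Average time per instruction, given n instructions per call.
    double per_instruction(std::uint64_t n) const {
        return calls == 0 ? 0.0 : total_ns / (static_cast<double>(calls) * n);
    }
};

struct Result {
    Stats add;
    Stats sub;
};

// Calls one of the two procedures at random, m times in total, and
// accumulates the elapsed time separately for each procedure.
Result run_trial(Kernel add_proc, Kernel sub_proc, std::uint64_t m, unsigned seed) {
    assert(add_proc != nullptr && sub_proc != nullptr);
    std::mt19937 rng(seed);
    std::bernoulli_distribution pick_add(0.5);
    Result r;
    for (std::uint64_t i = 0; i < m; ++i) {
        const bool use_add = pick_add(rng);  // choose before starting the clock
        const Kernel k = use_add ? add_proc : sub_proc;
        const auto t0 = std::chrono::steady_clock::now();
        k();
        const auto t1 = std::chrono::steady_clock::now();
        Stats& s = use_add ? r.add : r.sub;
        ++s.calls;
        s.total_ns += std::chrono::duration<double, std::nano>(t1 - t0).count();
    }
    return r;
}
```

The random choice is made before the clock starts, so the cost of the random number generator is not charged to either procedure. The cost of the timer calls themselves is still included, which matters most for small N.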

For N and M, all pairs of values from the list [10, 100, 1000, 10000, 100000] are used, except those in which N*M exceeds 10^8.
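The selection of (N, M) pairs can be sketched as follows. The function name valid_pairs and its parameters are hypothetical; the comparison is written as a division so that the product cannot overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Returns every (N, M) pair drawn from `sizes` whose product does not
// exceed `limit`.
std::vector<std::pair<std::uint64_t, std::uint64_t>>
valid_pairs(const std::vector<std::uint64_t>& sizes, std::uint64_t limit) {
    std::vector<std::pair<std::uint64_t, std::uint64_t>> out;
    for (std::uint64_t n : sizes) {
        assert(n != 0);  // sizes must be positive for the division below
        for (std::uint64_t m : sizes) {
            // Equivalent to n * m <= limit for positive integers.
            if (m <= limit / n) {
                out.emplace_back(n, m);
            }
        }
    }
    return out;
}
```

Called with the list above and a limit of 100000000, this keeps 22 of the 25 possible pairs; only (10000, 100000), (100000, 10000) and (100000, 100000) are dropped.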

Each test is run 100 times and the results are averaged. Testing was carried out on an idle machine. If you look at the histogram, you can see that subtraction is much faster. What could explain this?

UPD: I replaced the operands with rax and rdx, and everything fell into place. Thanks to all.

4 answer(s)

VenomBlood, 2012-12-06
@belk

sub reg, reg is a special case: the processor handles it the same way as xor reg, reg. The register renamer recognizes it as a zeroing idiom, zeroes the register itself, and removes the instruction from the pipeline before it ever reaches an execution unit.

cypok, 2012-12-06
@cypok

What if you try it with two different registers?

Alexey, 2012-12-06
@alexxxst

mov rax, 0

dlinyj, 2012-12-06
@dlinyj

Questionable results, of course, but the explanation may lie in the following. Some RISC processors have no addition instruction at all, only subtraction. Addition is then achieved by inverting the result with a NOT instruction. Perhaps it is the same story here, except that the processor performs the NOT internally, on the die, hence the delay.