(x86 asm): Zero the high parts of the ?MM registers if available.
There's a performance penalty to trying to preserve the upper parts of
the SSE/AVX vector registers, and it's pointless because we don't need
to preserve them. (Earlier AVX-capable processors would carefully snip
off the upper parts of the registers and put them in a box, and then
glue them back on when they were wanted, which isn't so bad. Later
processors instead just track the upper part of the register as an
additional operand, which leads to unnecessary latency.)
Add AVX-specific entry points to the necessary routines, and call them
when AVX is detected. This would all be easier if Intel had chosen the
encoding for `vzeroupper' from an existing `nop' encoding space, since
the instruction could then be executed unconditionally.
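
The dispatch described above can be sketched in C roughly as follows.
This is only an illustration, not the code from this commit: the names
`work', `work_generic', `work_avx', `pick_entry' and `cpu_has_avx' are
all hypothetical, and the detection assumes GCC or Clang on x86 (on any
other target it simply reports no AVX). In the real assembler routines
the AVX entry point would execute `vzeroupper' and then continue into
the shared body; here both stand-ins do the same arithmetic.

```c
/* Hypothetical signature shared by both entry points of one routine. */
typedef int (*entry_fn)(int);

/* Stand-in for the plain entry point. */
static int work_generic(int x) { return x + 1; }

/* Stand-in for the AVX entry point.  The real one would be assembler
 * which executes `vzeroupper' first and then joins the common body. */
static int work_avx(int x) { return x + 1; }

/* Pure selection logic, kept separate so it can be checked without
 * depending on the machine we happen to be running on. */
static entry_fn pick_entry(int have_avx)
{
  return have_avx ? work_avx : work_generic;
}

/* Assumption: use the GCC/Clang x86 builtins for detection.  Other
 * compilers and architectures fall back to `no AVX'. */
static int cpu_has_avx(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  return __builtin_cpu_supports("avx") != 0;
#else
  return 0;
#endif
}

/* Public wrapper: choose the entry point at call time. */
int work(int x)
{
  return pick_entry(cpu_has_avx())(x);
}
```

A caller just uses `work(41)' and gets the same answer either way; only
the entry point taken differs. Real code would probably cache the
detection result rather than repeat it on every call.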
13 files changed: