The current order is (y'_0, y'_1; y''_0, y''_1), (y'_2, y'_3; y''_2,
y''_3).  This makes sense in the context of SSE2, but it's not very
satisfactory as a common currency.  (In particular, if we want to
resolve the expanded factor into a value then we'll have to do it by
hand, piece by piece, because the limb placements are irregular.)

Instead, fix the ordering in the test stubs so that the pieces come out
as (y'_0, y''_0; y'_1, y''_1), (y'_2, y''_2; y'_3, y''_3), which is
generally much better to work with outside of SSE2.
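
As an illustration only (this is not the project's test code), the
reordering and the regular resolution it allows can be sketched in
Python.  The helper names `reorder' and `resolve' are hypothetical, and
the sketch assumes that each piece is a 16-bit limb with
y_i = y'_i + 2^16 y''_i, which this message doesn't itself state.

```python
def reorder(old):
    """Map old-order groups to new-order groups.

    Each old group is (y'_a, y'_b, y''_a, y''_b); the matching new
    group is (y'_a, y''_a, y'_b, y''_b).
    """
    return [(lo_a, hi_a, lo_b, hi_b) for (lo_a, lo_b, hi_a, hi_b) in old]


def resolve(new, limb_bits=16):
    """Resolve new-order groups into a single integer.

    ASSUMPTION: y_i = y'_i + 2^limb_bits * y''_i, so the k-th limb in
    sequence simply has weight 2^(limb_bits * k).
    """
    value = 0
    k = 0
    for group in new:
        for limb in group:
            value += limb << (limb_bits * k)
            k += 1
    return value
```

With the old order the weights would run 0, 32, 16, 48 bits within each
group, so the resolving loop couldn't just step through the limbs in
sequence.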

This only affects the test stubs, not the actual code, so performance
is unchanged.