Posted: 2012-05-13 - Link

Here’s some SSE code to normalize a 3-vector:

; vector in xmm0
movaps  xmm2,   xmm0
mulps   xmm0,   xmm0
movaps  xmm1,   xmm0
shufps  xmm0,   xmm0,   _MM_SHUFFLE (2, 1, 0, 3)
addps   xmm1,   xmm0
movaps  xmm0,   xmm1
shufps  xmm1,   xmm1,   _MM_SHUFFLE (1, 0, 3, 2)
addps   xmm0,   xmm1
rsqrtps xmm0,   xmm0
mulps   xmm0,   xmm2

Here’s some SSE code to normalize four 3-vectors:

; x's in xmm0, y's in xmm1, z's in xmm2
movaps  xmm3,   xmm0
movaps  xmm4,   xmm1
movaps  xmm5,   xmm2
mulps   xmm0,   xmm0
mulps   xmm1,   xmm1
mulps   xmm2,   xmm2
addps   xmm0,   xmm1
addps   xmm0,   xmm2
rsqrtps xmm0,   xmm0
mulps   xmm3,   xmm0
mulps   xmm4,   xmm0
mulps   xmm5,   xmm0
; x's in xmm3, y's in xmm4, z's in xmm5

As you’ve no doubt noticed, normalizing four vectors in the right format is just about as fast as normalizing one vector.

This is nothing new. It’s fairly well known that using structure of arrays opens up more optimization opportunities than arrays of structures.

The interesting thing here is that there is no API with a function called normalize(vec3 v) that can take advantage of this. There is no way to break down the problem of normalizing four vectors in a way that recombining the parts is as efficient as the whole.

Holism is the idea that systems should be viewed as wholes rather than the sum of their parts. The opposite of Holism is Reductionism. A Reductionist, when presented with the problem of normalizing 100 vectors would first find a way to normalize one vector, then apply that 100 times. A Holist would see the problem as a whole, and realize that the normalizations can be done in parallel. The Reductionist would use the first code snippet. The Holist would use the second.

I like this quote from Stepanov on the subject:

It is essential to know what can be done effectively before you can start your design. Every programmer has been taught about the importance of top-down design. While it is possible that the original software engineering considerations behind it were sound, it came to signify something quite nonsensical: the idea that one can design abstract interfaces without a deep understanding of how the implementations are supposed to work. It is impossible to design an interface to a data structure without knowing both the details of its implementation and details of its use. The first task of good programmers is to know many specific algorithms and data structures. Only then they can attempt to design a coherent system. Start with useful pieces of code. After all, abstractions are just a tool for organizing concrete code.

If I were using top-down design to design an airplane, I would quickly decompose it into three significant parts: the lifting device, the landing device and the horizontal motion device. Then I would assign three different teams to work on these devices. I doubt that the device would ever fly. Fortunately, neither Orville nor Wilbur Wright attended college and, therefore, never took a course on software engineering. The point I am trying to make is that in order to be a good software designer you need to have a large set of different techniques at your fingertips. You need to know many different low-level things and understand how they interact.

I think the take-away point here is that you should consider your problem as a whole before you begin breaking it down into smaller parts. It’s tempting to start creating a separate class for every noun in your application domain, but more often than not there is a better design involving a more holistic approach.