December 2006 Archives
In my write-up on porting Windows assembly to OS X, I mentioned that there is an easier way than combing through your code to make sure the stack is properly aligned. It turns out gcc has a nice compiler flag, called -mstackrealign, with the following effect:
Realign the stack at entry. On the Intel x86, the -mstackrealign option will generate an alternate prologue/epilogue that realigns the runtime stack. This supports mixing legacy codes that keep a 4-byte aligned stack with modern codes that keep a 16-byte stack for SSE compatibility. The alternate prologue and epilogue are slower and bigger than the regular ones, and they require one dedicated register for the entire function. This also lowers the number of registers available if used in conjunction with the "regparm" attribute. Nested functions encountered while -mstackrealign is on will generate warnings, and they will not realign the stack when called.
So this sounds great! For a small performance hit, you can use your legacy code without modification. OS X doesn't use registers to pass parameters, so there's no conflict with regparm. However, I tried to use this option in my project, and it still crashed. Not on the movdqa instruction as before, but in seemingly random places. And only in Release mode. It sounds like this option doesn't play nicely with compiler optimizations, and I just had to investigate.
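When chasing crashes like these, it can help to check alignment explicitly rather than wait for an aligned SSE instruction to fault. Here is a minimal sketch in C. The helper names is_aligned_16 and require_aligned_16 are hypothetical, made up for illustration; they are not part of gcc or of my project.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if p sits on a 16-byte boundary, 0 otherwise.
   Aligned SSE moves such as movdqa need this for memory operands. */
static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical debugging helper: aborts in debug builds if p is not
   16-byte aligned. Compiled out when NDEBUG is defined, which is
   typical for Release builds. */
static void require_aligned_16(const void *p)
{
    assert(is_aligned_16(p));
}
```

Calling require_aligned_16 on the address of a local variable that is supposed to be 16-byte aligned, at the top of a suspect function, should turn a misaligned stack into an immediate assertion failure in a Debug build. Since assert is normally disabled in Release builds, a Release-only crash would need the check done by hand with is_aligned_16 instead.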
With all Macs now shipping with Intel microprocessors, you'd think it would be easy to port assembly code written for Windows to OS X. Bzzt. Unfortunately, not. The problem lies with OS X's calling conventions on 32-bit Intel machines.
Calling conventions are agreed-upon standards at the assembly level, dictating how registers and the stack are used for passing parameters and returning results. Basically, all code in a single program must agree on the same convention, otherwise you have a communication breakdown and bad things happen. You wouldn't want one function that assumes a parameter is in one register calling another function that expects it in a different one. Unfortunately, OS X deviates from standard Windows (cdecl) calling conventions.
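Assuming the deviation that matters here is the 16-byte stack alignment mentioned in the -mstackrealign write-up, the arithmetic a hand-written call site has to do can be sketched in C. The function name call_padding is hypothetical, and the sketch assumes the stack pointer was already 16-byte aligned before any arguments were pushed.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
   Given the total size in bytes of the arguments about to be pushed,
   return how many extra bytes to subtract from the stack pointer first
   so that the stack is 16-byte aligned again at the call instruction.
   Assumes the stack pointer was 16-byte aligned before the pushes. */
static size_t call_padding(size_t arg_bytes)
{
    return (16 - (arg_bytes % 16)) % 16;
}
```

For example, three 4-byte arguments occupy 12 bytes, so call_padding(12) is 4: the caller would reserve 4 extra bytes before pushing. With four 4-byte arguments (16 bytes) no padding is needed.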
I just read a story on the Quake 3 source code. This story is not exactly new, but it just got posted on Slashdot, and it's a fascinating read. However, one particular comment irked me:
To all those wondering why John bothers to push out the source to id's game engines after the fact, the snippet of code at the very top of this article is a poster child for why. Not only do you get well-programmed and well-optimised 3D engines to modify and learn from, you get gems like the fast invsqrt function [...]
The source was released under the GNU General Public License (GPL). This technically means that if you use this code, your code must also be released as GPL. So yeah, I can learn from this code, and I can even modify and use it in another "free" application, but I cannot use it in a commercial-only application. To quote Bill Bumgarner:
Let me be utterly simplistic and blunt: The GPL is not a free license very much because it limits your freedom to do what you want with whatever it is that is under the GPL. Period.
Granted, this snippet of code has been traced back to pre-id origins over 15 years ago, and is probably considered in the public domain by now. And I'm sure id chose the GPL, as Sun just did with Java, precisely so that a competitor cannot snap up the code and use it as their own. I hate to be the license police, but it's important to understand the difference between all the open source licenses.
