Assembly stuff
I’ve always admired Michael Abrash. If you don’t know who he is, the man is sort of a legend. He helped develop the original Quake engine and wrote tons of articles on how to get the maximum speed out of your PC in the early 90s, when cycle counting was still something respected.
I’ve always been interested in learning from him and reading his articles, so I worked up the courage and started reading assembly books for the 8086. I picked up a really nice one, Assembly Language Step-by-Step, 2nd edition, and started studying.
I could say I “learned” assembly for the 8086 in college, but it was a basic course that taught mostly the beginner stuff, you know: mov this, add that, inc this, int 21h that. Understanding each instruction on its own was not enough for me; I wanted to get to the level Abrash talked about.
So I finished that book and have been reading and re-reading it as I go through Zen of Code Optimization and Write Great Code, Volume 2.
I must say it’s been a bumpy ride, going back and forth between these books. Reading a chapter from one, going back to the other, all while trying to understand what I just read. No wonder it takes time to master this sort of stuff.
Personally, it’s also very satisfying to read these sorts of books and at least be able to understand them. It gives a great sense of improvement. When I finish all 3 of them, let’s see where I am.
But right now I want to share with you a small victory I feel I just had. One basic memory handling function in C is
memset( dest, val, size ) ;
I stepped into its assembly code and managed to understand it. The most important instruction inside this function is
rep stosd
which is what actually sets the memory once all the registers have been set up. Inside there’s a bunch of checks for redundancies and type safeties, so I wrote the following, which does what memset does and only the memory setting. No type checking, no register juggling, no nothing. This is what I got:
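mov eax, 10            ; value to store in each dword
mov ecx, 16            ; number of dwords to write
lea edi, dword ptr a   ; destination address (a must hold at least 16 dwords)
rep stosd              ; store eax at [edi], ecx times (assumes the direction flag is clear)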
This is basically what Assembly Language Step-by-Step teaches in one of its chapters. In the end it came out 2 cycles faster than memset.
2 cycles faster. I am proud of myself, haha.
Yes, a small victory, but hopefully one of many to come. Let’s see how this goes.