First Shader and why I will post useless shit from now on
By J.Raza On August 31st, 2011
I’ve been researching shaders in my spare time after work. Below is my first stab at writing one from scratch.
You can download it here: First Shader. Note: it only works with graphics cards that support OpenGL 2.0 and up.
It’s a simple little project, but there’s a special reason why I’m sharing it… Allow me to elaborate.
I recall my attempts at writing game development programs back in my college years. I was using DX7 back then. Those attempts were quite crude; all they showed was simple sprites walking across the screen. I had several little projects like that, but the most infamous one, which became notorious even amongst my college friends, was the following: loading a bitmap file and rendering it on-screen using DirectDraw. The image file I was trying to load? Jester’s cap.
Believe it or not, it took me a while to complete that project. It’s one of the reasons why it became so infamous. Now don’t get me wrong, there are several libraries out there that do that for you. Java and XNA even have built-in functions just for that. To this day I recall a friend saying, after I had finished the project, “It took you so long to program just that? Man, in Java I can write code to load that image and render it in less than 3 minutes!”
Now my friends and commentators were all right: Java, XNA or a middleware solution would have been a finer choice for the goal of rendering an image on screen. But that wasn’t my goal. My goal was to actually understand how these frameworks worked internally. Good developers knew how, and my plan was to not lag behind them. I wasn’t just trying to render the card and finish a project. What I was doing was trying to become a better coder.
The reason why it took me forever to render Jester’s cap in DDraw was newbie mistakes: I wasn’t checking memory bounds properly, I wasn’t initializing variables properly, and so on. For a while I was stuck on the last part, where the pixel alignment was wrong somewhere in the code. I’ll never forget the day when I was simply strolling back to my house after a college class and it hit me out of nowhere: the image is 24 bits per pixel, and my render buffer is 32!
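For anyone who hasn't hit this one before, here is a minimal sketch of the mismatch. It is not the original DirectDraw code: the function name expand_24_to_32 and the 0xFF filler byte are assumptions made for illustration, and row padding in real bitmap files is ignored. The point is only that each source pixel is 3 bytes wide while each destination pixel is 4, so the two buffers cannot be walked with the same stride.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands tightly packed 24-bit pixels (3 bytes each) into 32-bit pixels
// (4 bytes each). The fourth byte of every output pixel is filled with 0xFF.
// Hypothetical helper: real bitmap rows may also carry padding bytes.
std::vector<std::uint8_t> expand_24_to_32(const std::vector<std::uint8_t>& src)
{
    const std::size_t pixel_count = src.size() / 3;
    std::vector<std::uint8_t> dst(pixel_count * 4);
    for (std::size_t p = 0; p < pixel_count; ++p)
    {
        dst[p * 4 + 0] = src[p * 3 + 0];   // source advances 3 bytes per pixel
        dst[p * 4 + 1] = src[p * 3 + 1];
        dst[p * 4 + 2] = src[p * 3 + 2];
        dst[p * 4 + 3] = 0xFF;             // destination advances 4 bytes per pixel
    }
    return dst;
}
```

Copying the source bytes straight into the 32-bit buffer, without this re-striding, shifts every pixel a little further out of place, which is the sort of misalignment that can look like a bug anywhere else in the code.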
I quickly ran home, fixed the code and got it working. I recall raising my clenched fists in the air out of sheer excitement. Regardless of how simple the project was, I had risen to the challenge I had set out for myself.
I bragged to my friends that I had accomplished such a feat, and was even thinking about publishing it somewhere on the net. But that’s when I became shy of the jester’s cap project. I said to myself “it’s the most stupid application in the world. It shows no major accomplishment; any experienced developer could do it. It’s just rendering an image on screen.” I was ashamed of sharing something that, although a personal achievement on a small scale, was quite silly on a grander scale.
So I didn’t share it, and with the PC meltdown I had in 2006, I ended up losing it. It wasn’t a big deal to me back then anyway, since I had moved on to other projects such as Solis. But now that I look back on it, I feel sad for having lost that simple program that eventually led me down a long road…
So after saying all of that, that is why from now on I shall share things on this blog regardless of how stupid, silly or pretentious they might be. Hell, it’s my blog anyway. However, there’s a grander reason for why I’m doing this… Following up on my sculptor vs. mason article, I think there’s another character trait of sculptors that I need to exercise: sculptors aren’t afraid to make mistakes.
And so that leaves us with the above shader… it’s up for grabs and shall be like so forever. It’s a simple, stupid shader, but by Nascimento I am proud of it!
A crude attempt to reverse engineer a Super Mario World glitch
By J.Raza On February 4th, 2011
Every now and then I remember the games I used to play as a child. One of those games was Super Mario World for the SNES. The reason why I bring this up is a very cool trick you could pull off in the Forest of Illusion level. With it, a semi-skilled player could go from 0 to 99,999,990 points in a few minutes.
Here’s the video showing you how to do it, it’s pretty neat:
Now as you watch it, you can see Mario jumping from Wiggler to Wiggler, and while he does that he gets points and extra lives. Here are the screenshots of his points:
And as he continues to bounce, the lives he gets:
However on the 7th wiggler jump he gets this symbol:
And then this one:
And so on and so on:
When I was 9 years old and saw those symbols for the first time, I had no idea what they were. The manual where I read about this secret trick said they were “secret super symbols that show how awesome you are!” But after reviewing this video a few times, I think I have a different opinion on what they might be: an offset pointer into video memory going haywire.
To understand what I mean by that, let’s have a quick discussion. In classic console programming, an image (normally known as a sprite) is drawn onto the screen in the following sequence:
- First you must load the sprite into memory. Sprites are usually stored in memory sequentially, so images are grouped together (for caching purposes). This also has the benefit of making animation algorithms easier to implement. Here’s what sprite memory for the NES Zelda looks like:

This picture is taken from http://randomhoohaas.flyingomelette.com/RandomHooHaas-ripsprites.htm
- You then use an offset pointer to tell the sprite rendering engine which part of memory to draw onto the video memory. This offset is usually in bytes. If you translate bytes to picture offsets, you are then able to ‘crop’ pictures and render individual sprites onto the screen. Assume I did that with the previous picture, and made an algorithm that, given an (x,y) coordinate, would give me the cropped-out sprite. So for (4,3) I’d get the image below:

- This cropped image is then copied into video memory, which is in turn read out to a PAL/NTSC scan line renderer. The result is an image drawn onto the TV screen.
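The sequence above can be sketched roughly as follows. This is only an illustration under loud assumptions: the SpriteSheet struct, sprite_offset and crop_sprite are made-up names, and the sheet is assumed to be one linear block of pixels with a fixed cell size, which is simpler than the tile formats real consoles use.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sprite sheet: one linear block of pixels, split into equal cells.
struct SpriteSheet
{
    std::size_t width_px;          // width of the whole sheet, in pixels
    std::size_t cell_w;            // width of one sprite, in pixels
    std::size_t cell_h;            // height of one sprite, in pixels
    std::size_t bytes_per_pixel;
    std::vector<std::uint8_t> pixels;
};

// Byte offset of the top-left pixel of the sprite at cell (col, row).
std::size_t sprite_offset(const SpriteSheet& s, std::size_t col, std::size_t row)
{
    return (row * s.cell_h * s.width_px + col * s.cell_w) * s.bytes_per_pixel;
}

// "Crops" one sprite out of the sheet, copying it one scan line at a time.
std::vector<std::uint8_t> crop_sprite(const SpriteSheet& s, std::size_t col, std::size_t row)
{
    const std::size_t row_bytes = s.cell_w * s.bytes_per_pixel;   // one sprite line
    const std::size_t pitch = s.width_px * s.bytes_per_pixel;     // one sheet line
    const std::size_t start = sprite_offset(s, col, row);
    std::vector<std::uint8_t> out;
    out.reserve(row_bytes * s.cell_h);
    for (std::size_t y = 0; y < s.cell_h; ++y)
    {
        const std::uint8_t* line = s.pixels.data() + start + y * pitch;
        out.insert(out.end(), line, line + row_bytes);
    }
    return out;
}
```

Note that nothing in sprite_offset checks that (col, row) actually lies inside the sheet. Feed it a coordinate that is too large and it will happily hand back an offset into whatever happens to be stored next.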
Given that condition, here’s one of the most basic animation algorithms for sprite based rendering. You’d load a sprite such as this:
Into memory and you’d use the sprite render pointer offset to interpolate between the frames. So you’d start with coordinates (0,0), then after a few fractions of a second you’d switch to (1,0), then (2,0) and so on. Via this method you’d get an animation such as:
Given that technical description and the above Super Mario World glitch, here’s my assumption of what went wrong:
- The game had a counter (let’s name it the jump_counter) that started at zero. This counter tracked how many times the player jumped on a “bouncy” object (such as the wiggler), without touching the ground.
- As the player jumped on “bouncy“ objects, the game would use the jump_counter to track:
- The bonus the player would win: points and/or extra lives
- The associated image for such bonus.
So say the player jumped on a wiggler. The jump_counter would increase to 1 and render 1000 points. If the player jumped on another wiggler, the jump_counter would go to 2, and render 2000. As the player continued to “bounce” around, this process would continue.
So my assumption is that the algorithm was basically:
if( player_jumped_on_bouncy_object() )
{
    jump_counter++ ;
    points_variable += power_of_two( jump_counter - 1 ) * 1000 ;
    player_lives += ( jump_counter > 4 ) ? 1 : 0 ;
    ImageHandle image_to_render = memory_offset_via_jump_counter( jump_counter ) ;
    render_image( get_player_bounced_pos(), image_to_render ) ;
}
Now let’s explain the above pseudo-code:
- Perform this algorithm only if the player bounced on something.
- We increase the jump_counter, which is the base counter for the entire algorithm.
- We increase the number of points by 2 ^ ( jump_counter - 1 ) * 1000, so the first bounce is worth 1000, the second 2000, and so on.
- If the player “bounced” on more than 4 objects in a row, he’d get an extra life per bounce after that.
- We retrieve an ImageHandle that is relative to the jump_counter variable. So if jump_counter is 1, the ImageHandle returned is the 1000 point image. If it’s 2, the 2000, and so on.
- We then render the image we just retrieved at the location where Mario bounced.
Simple, isn’t it? However there’s a bug in the above pseudo-code. We can only retrieve valid image handles while jump_counter is in the range [1,7]. What if we make a request when jump_counter is 8 or above?
And that’s the catch! We get an invalid pointer offset to somewhere else in memory! Since that memory area is used for what’s to be displayed on the screen, we get bits and pieces from other sprites.
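To make the idea concrete, here is a small sketch of that kind of bug. Everything in it is invented for illustration (the array contents, the names image_for_jump and image_for_jump_checked); it is not how the actual game stores its data. The bonus images and some unrelated sprites sit next to each other in one block of "sprite memory", and the lookup trusts the counter blindly.

```cpp
#include <cstddef>

// One flat block of pretend sprite memory. The first 7 entries stand in for
// the bonus images; everything after them belongs to unrelated sprites.
// All contents are made up for illustration.
const char* const sprite_memory[] = {
    "1000", "2000", "4000", "8000", "1UP", "2UP", "3UP",   // valid bonus images
    "shell", "coin", "cloud"                                // other sprites
};
const std::size_t bonus_image_count = 7;

// Unchecked lookup, as in the pseudo-code: trusts jump_counter blindly.
// In this sketch the caller must keep jump_counter within 1..10.
const char* image_for_jump(std::size_t jump_counter)
{
    return sprite_memory[jump_counter - 1];
}

// Checked lookup: clamps the counter to the last valid bonus image.
const char* image_for_jump_checked(std::size_t jump_counter)
{
    if (jump_counter < 1)
        jump_counter = 1;
    if (jump_counter > bonus_image_count)
        jump_counter = bonus_image_count;
    return sprite_memory[jump_counter - 1];
}
```

With a jump_counter of 8, image_for_jump returns the neighbouring "shell" entry instead of a bonus image, while image_for_jump_checked keeps returning the last valid one. A single clamp like that would be enough to hide the symptom, if my assumption about the cause is right.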
Obviously this is all an assumption of what went wrong. I could be incredibly incorrect in my analysis, but at least it’s worth a shot. What are your thoughts on this one?
Step by Step fast algorithm construction in native code
By J.Raza On January 28th, 2011
A few weeks ago I saw a forum question where a user asked for help making an algorithm that:
“replaces all the vowels in a string with the character ‘*’”
The other users in the forum quickly replied and helped him with that question. For some reason though, that problem stuck in my head. Sure it’s a simple algorithm, but I thought: what if I added a new constraint? It has to be done fast, really, really fast.
To me this created a whole new approach to the problem, and I was interested in seeing how far I could take it. So, for starters I took the best reply to the forum thread, from user Dominuz (who himself stated that he focused more on clarity and ease of understanding than on optimization).
Below is his sample code:
#include <stdio.h>
char vogais[] = {'a','e','i','o','u'};
void subst(char* nome)
{
    for( unsigned int i = 0; nome[i] != '\0'; ++i )
    {
        for( unsigned int j = 0; j < 5; ++j )
        {
            if( nome[i] == vogais[j] )
            {
                nome[i] = '*';
                break;
            }
        }
    }
}
int main()
{
    char criatura[] = "12 Criatura aeiou 123456";
    printf( "%s \n", criatura );
    subst( criatura );
    printf( "%s \n", criatura );
    return 0;
}
For starters, I knew a simple text string would not be enough to measure the performance of different algorithms. So I decided to take a larger text as input, in this case Michael Jackson’s Wikipedia page, as a tribute to the late king of pop. The total file was 273 KB.
I modified Dominuz’s code to open the file and perform the subst function a thousand times, while recording how long each run took. The results were stored in a text file. Here’s my initial modification of his code:
#include <iostream>
#include <fstream>
#include <cstdio>     // sprintf_s
#include <cstring>    // strlen
#include <Windows.h>  // QueryPerformanceCounter / QueryPerformanceFrequency
using namespace std;
char vogais[] = {'a','e','i','o','u'};
// Empties out.txt so each run starts with a clean results file.
void trunc_double()
{
    fstream file( "out.txt", ios::out | ios::trunc ) ;
    file.close() ;
}
// Appends one timing sample (in seconds) to out.txt.
void save_float( double d )
{
    char buffer[256] ;
    fstream file( "out.txt", ios::out | ios::app ) ;
    sprintf_s( buffer, "%.15f\n", d ) ;
    file.write( buffer, strlen(buffer) ) ;
    file.close() ;
}
// Loads mj.txt into a freshly allocated, null-terminated buffer.
char* malloc_and_setup_buffer()
{
    char *b = NULL ;
    int b_size = 0 ;
    fstream file ;
    file.open( "mj.txt", ios::in | ios::binary ) ;
    if( !file )
        return NULL ;
    file.seekg( 0, ios_base::end ) ;
    b_size = (int) file.tellg() ;
    file.seekg( 0, ios_base::beg ) ;
    b = new char[b_size + 1] ;   // one extra byte for the terminator
    file.read( b, b_size ) ;
    file.close() ;
    b[b_size] = '\0' ;           // terminate without overwriting the last byte of text
    return b ;
}
void dealloc_buffer( char* b )
{
    if( !b )
        return ;
    delete[] b ;
}
void subst(char* nome)
{
    if( !nome )
        return ;
    for( unsigned int i = 0; nome[i] != '\0'; ++i )
    {
        for( unsigned int j = 0; j < 5; ++j )
        {
            if( nome[i] == vogais[j] )
            {
                nome[i] = '*';
                break;
            }
        }
    }
}
int main()
{
    LARGE_INTEGER lg0, lg1, frequency ;
    QueryPerformanceFrequency(&frequency) ;
    trunc_double() ;
    for( int i=0; i<1000; i++ )
    {
        // Reload the text every run, since subst modifies it in place.
        char* b = malloc_and_setup_buffer() ;
        if( !b )
            return 1 ;   // could not open mj.txt
        // Only the call to subst is timed.
        QueryPerformanceCounter(&lg0) ;
        subst( b );
        QueryPerformanceCounter(&lg1) ;
        double dt = (double)(lg1.QuadPart - lg0.QuadPart) / (double)frequency.QuadPart ;
        save_float(dt) ;
        dealloc_buffer( b ) ;
    }
    return 0;
}
As you can see, I’m only concerned with how long it takes to run the subst function. After putting this data into Excel, I got an average of 0.002205584 seconds for this first sample. From that basic setup, I started the optimizations.
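If you are not on Windows, the same kind of measurement can be sketched with the standard library instead of QueryPerformanceCounter. This is not the harness used for the numbers in this post; average_seconds is a hypothetical helper, and it assumes the work to be timed can be wrapped in a callable that takes no arguments.

```cpp
#include <chrono>

// Runs `work` `runs` times and returns the average wall-clock time of one
// run, in seconds. `work` is any callable taking no arguments.
// Hypothetical helper, not the Windows timing code shown above.
template <typename Work>
double average_seconds(Work work, int runs)
{
    using clock = std::chrono::steady_clock;   // monotonic, suited to intervals
    double total = 0.0;
    for (int i = 0; i < runs; ++i)
    {
        const clock::time_point t0 = clock::now();
        work();
        const clock::time_point t1 = clock::now();
        total += std::chrono::duration<double>(t1 - t0).count();
    }
    return runs > 0 ? total / runs : 0.0;
}
```

Since subst changes the text in place, the callable should work on a fresh copy of the input each run, otherwise every run after the first one is scanning a string that has no vowels left.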
I quickly saw that you could force-inline the subst function, making a call to it faster. However, the biggest issue I had with it was accessing ‘vogais’ as a global variable instead of a local function variable.
What’s the big deal about that? Well, when you reference a global variable the compiler generally has to go out to its fixed address in memory, whereas a small local variable lives on the stack, which is almost always sitting hot in the CPU’s cache, and is easier for the compiler to keep in registers. So in terms of access speed, locals tend to beat globals.
So after re-implementing subst I got this:
__inline void subst(char* nome)
{
    if( !nome )
        return ;
    char vogais[] = {'a','e','i','o','u'};
    for( unsigned int i = 0; nome[i] != '\0'; ++i )
    {
        for( unsigned int j = 0; j < 5; ++j )
        {
            if( nome[i] == vogais[j] )
            {
                nome[i] = '*';
                break;
            }
        }
    }
}
After running it again and calculating the average I got 0.001782024 seconds. Not bad, I got a bit of an improvement over the previous algorithm. It still wasn’t enough to say I had a significant impact though.
If you look inside subst we have two loops: one iterates through each letter while the other iterates through each vowel. Well, loops translate into assembly as compare and jump instructions, and those can be expensive. So in order to get rid of the inner loop’s jumps I performed a technique called ‘loop unrolling’. Below is the result:
__inline void subst(char* nome)
{
    if( !nome )
        return ;
    for( unsigned int i = 0; nome[i] != '\0'; ++i )
    {
        if( nome[i] == 'a' || nome[i] == 'e' || nome[i] == 'i' || nome[i] == 'o' || nome[i] == 'u' )
            nome[i] = '*';
    }
}
Its average was 0.001161438 seconds. Not bad! Almost twice the performance when compared to the first one. But I wasn’t ready to finish yet.
You see, my big issue with this model is that we are performing up to five vowel comparisons per letter. This translates to several compare and jump instructions in assembly, and that’s just slow. I had to think of a way of getting rid of those comparisons.
After giving it some thought, I found a way out! I used a lookup table. Since a byte can only have 256 different values, I created a lookup table 256 bytes in size. All bytes were set to 0, except indexes 97, 101, 105, 111 and 117, which were set to 1. What’s special about those values? Well, they are exactly the codes for the vowels a, e, i, o and u in the ASCII chart.
I thus use the letter itself as the index in the lookup table, which indicates if it’s a vowel or not. Here’s the code:
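__inline void subst(char* nome)
{
    if( !nome )
        return ;
    // 256-entry table indexed by byte value: 1 for vowels, 0 otherwise.
    // memset needs <string.h> / <cstring>.
    char table[256] ;
    memset( table, 0, sizeof table ) ;
    table['a'] = 1 ;
    table['e'] = 1 ;
    table['i'] = 1 ;
    table['o'] = 1 ;
    table['u'] = 1 ;
    for( unsigned int i = 0; nome[i] != '\0'; ++i )
    {
        // Cast to unsigned char: plain char may be signed, and a negative
        // value would index outside the table.
        if( table[ (unsigned char) nome[i] ] )
            nome[i] = '*';
    }
}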
The result? 0.000951169 seconds. Not bad, I managed to get in under 1 millisecond.
I kept thinking though that there was something else that I could add to this code… somehow I was missing something obvious. After a few minutes it hit me! I could do loop unrolling again and perform even fewer jump instructions!
After some testing, I decided to unroll the loop 4 times. For that I had to split the loop into two parts. The first part would iterate in steps of four up to a multiple of four that is not greater than the string’s size. The second loop would take care of the few characters left over at the end of the text.
Now to round a number down to a multiple using integer arithmetic you do:
number = ( number / multiple ) * multiple
Since integer division naturally rounds down, we reach our value. The problem though is that multiplication and division are expensive and should be avoided in fast algorithm construction. Is there a way to get rid of them?
Well yes! We just have to use bitwise operations. I didn’t choose to unroll the loop four times for no reason. Since four is a power of two, it has the following property: adding three (four minus one) to a number and then performing a bitwise AND with the bitwise NOT of three rounds that number up to the next multiple of four, or leaves it alone if it already is one. We just subtract four from the result and voilà, we have a multiple of four that is never greater than the original number. Whatever is left over is picked up by the second loop.
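Here is a tiny sketch of the bit trick on its own, so it can be checked in isolation. The function names are mine, picked for illustration. The first function is the plain "clear the two lowest bits" form; the second is the add, mask and subtract variant described above. Both assume a non-negative n.

```cpp
// Largest multiple of four that is <= n, by clearing the two lowest bits.
// Gives the same result as (n / 4) * 4 for n >= 0.
int round_down_to_4(int n)
{
    return n & ~0x03;
}

// The variant described above: round up to a multiple of four, then step
// back by four. The parentheses matter: binary minus binds tighter than &,
// so (n + 3) & ~0x03 - 4 would be parsed as (n + 3) & (~0x03 - 4).
int round_up_to_4_minus_4(int n)
{
    return ((n + 3) & ~0x03) - 4;
}
```

For 13 both give 12. For a number that is already a multiple of four, such as 12, the first gives 12 and the second gives 8, so the second form simply leaves four characters instead of zero for the clean-up loop.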
So finally here’s the result:
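// buffer_data is assumed here to be a small struct holding the text pointer (b)
// and its length in bytes (size_b); its definition is not shown in this post.
__inline void subst( buffer_data bd )
{
    char table[256] ;
    memset( table, 0, sizeof table ) ;
    table['a'] = 1 ;
    table['e'] = 1 ;
    table['i'] = 1 ;
    table['o'] = 1 ;
    table['u'] = 1 ;
    int i ;
    // Parentheses are required: binary minus binds tighter than &.
    const int upper_s = ((bd.size_b + 3) & ~0x03) - 4 ;
    // Main loop, unrolled four times.
    for( i = 0; i < upper_s ; i+=4 )
    {
        // Cast to unsigned char so bytes above 127 cannot become negative indexes.
        if( table[ (unsigned char) bd.b[i] ] )
            bd.b[i] = '*';
        if( table[ (unsigned char) bd.b[i+1] ] )
            bd.b[i+1] = '*';
        if( table[ (unsigned char) bd.b[i+2] ] )
            bd.b[i+2] = '*';
        if( table[ (unsigned char) bd.b[i+3] ] )
            bd.b[i+3] = '*';
    }
    // Clean-up loop for whatever is left after the last full group of four.
    for( ; i < bd.size_b ; i++ )
    {
        if( table[ (unsigned char) bd.b[i] ] )
            bd.b[i] = '*';
    }
}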
The result is then 0.000845947 seconds. I was almost ready to settle with it but I remembered one last detail: cache.
I’m using a 256-entry lookup table. But I only care about the lowercase alphabet characters in the ASCII chart. I could thus reduce the table to 32 entries, which is the closest power of two above 26, the number of letters in the alphabet. The bright side of having a smaller lookup table is that we manage to keep it longer in the CPU’s cache. Fewer cache misses, faster algorithm. So here’s the code with that in mind:
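// buffer_data is assumed here to be a small struct holding the text pointer (b)
// and its length in bytes (size_b); its definition is not shown in this post.
__inline void subst( buffer_data bd )
{
    // 32-entry table covering only 'a'..'z' (ASCII 97..122), indexed by letter - 97.
    char table[32] ;
    memset( table, 0, sizeof table ) ;
    table['a'-97] = 1 ;
    table['e'-97] = 1 ;
    table['i'-97] = 1 ;
    table['o'-97] = 1 ;
    table['u'-97] = 1 ;
    int i ;
    // Parentheses are required: binary minus binds tighter than &.
    const int upper_s = ((bd.size_b + 3) & ~0x03) - 4 ;
    for( i = 0; i < upper_s ; i+=4 )
    {
        // Every character needs its own range check before it can index the
        // table; a single check on bd.b[i] would skip or mis-index the other three.
        if( bd.b[i] >= 97 && bd.b[i] <= 122 && table[ bd.b[i] - 97 ] )
            bd.b[i] = '*';
        if( bd.b[i+1] >= 97 && bd.b[i+1] <= 122 && table[ bd.b[i+1] - 97 ] )
            bd.b[i+1] = '*';
        if( bd.b[i+2] >= 97 && bd.b[i+2] <= 122 && table[ bd.b[i+2] - 97 ] )
            bd.b[i+2] = '*';
        if( bd.b[i+3] >= 97 && bd.b[i+3] <= 122 && table[ bd.b[i+3] - 97 ] )
            bd.b[i+3] = '*';
    }
    // Clean-up loop for whatever is left after the last full group of four.
    for( ; i < bd.size_b ; i++ )
    {
        if( bd.b[i] >= 97 && bd.b[i] <= 122 && table[ bd.b[i] - 97 ] )
            bd.b[i] = '*';
    }
}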
And the final result is 0.00084498 seconds. Not much faster than the previous algorithm, but a lot faster when compared to the first one. In fact it’s over two and a half times faster.
Now I’m sure that if I kept at it I’d find even better ways to optimize this algorithm, but I’m settling with what I got for now. With this exercise I just wanted to prove a few points:
1. Knowing computer architecture and how code translates into assembly can be quite useful.
2. Like Michael Abrash pointed out, there’s no such thing as the fastest code in the west.
3. Knowing bitwise and pointer arithmetic helps as well.
4. I should have a better social life.
Well I guess that’s it for now! Click here: Fast Vowel to download the sample code and take a look at it yourself. Do keep in mind that I only tested this on one machine, and in different environments the results may vary.
The comparison table (average time per run, in seconds):
- Dominuz’s code: 0.002205584
- Local variable + inline: 0.001782024
- Vowel loop unrolling: 0.001161438
- 256 lookup table: 0.000951169
- 256 lookup table + loop unrolling: 0.000845947
- 32 lookup table + loop unrolling: 0.00084498
Reading, re-reading and grokking
By J.Raza On August 7th, 2010
Well, it took a while, but I finally managed to finish reading Michael Abrash’s Graphics Programming Black Book:
In some ways I feel ashamed to say that I ‘read’ it, because it’s a behemoth of knowledge spanning over 1200 pages. Of course I learned a lot from it, but it’s the sort of book I’ll read again and again and again until I can finally grok it. It’s a collection of over 10 years of Abrash’s papers and I doubt one can absorb it in a matter of months.
You see, to learn programming concepts in a self-taught manner I think it’s crucial to not only read the code in the book, but also write it down and play around with it, to truly understand what is being taught in its finer details. With my current project I intend to do just that, since it’s an FPS and the last chapters of the book deal directly with Quake’s development at id Software, where Abrash worked.
What’s more interesting is that the author doesn’t focus only on the development aspect of programming, but on the general mentality of it. Not how one solves a problem, but the mind that solves it. As you become better at development I think you ask yourself less “how do I solve this problem” and more “what is the best way to solve this problem”. Abrash shows us several ways to solve a problem in the book, be it linked lists, spatial visibility or making a faster game of life, each one consistently faster than the last, with either assembly optimizations, algorithm optimizations or rethinking the whole approach to the problem. The idea is to not expect that there is only one way to handle an issue. In his own words: “Assume nothing”.
The book is also quite pleasant to read, since the author narrates the development cycle more as a journal than a tech book. It’s quite interesting to read the last chapters, where he focuses on making a faster back-to-front polygon rendering approach for Quake. It almost goes like this:
March 14, 1941. We begin our approach to the BSP tree. We’re still taking heavy losses trying to figure out a way to make the spatial visibility problem faster. The worlds we want with Quake feature at least 5000 polygons, and in the worst-case scenario we redraw each pixel 5 times. It’s too slow, we must take a better approach.
May 22, 1941. We successfully managed to create a potentially visible set (PVS) that managed to break into enemy lines. We will now proceed to use it to flank their defenses.
June 10, 1941. We have now conquered the enemy’s battlefield. I’ve reduced the inner loop of the rasterizer to 2.5 cycles per texel. We decided to use z-buffering for drawing the enemy meshes, since it’s faster and not as big a problem as we expected. Victory is imminent.
And so on. Overall the book can be divided into 3 parts:
- General assembly optimization techniques
- 3D rendering done via software
- Common 3D engine development problems and solutions.
I recommend it to anyone who’s interested in approaching game development or programming in a serious yet elegant manner. I learned a lot from it, and still intend to learn more.
Slow Agile Development
By J.Raza On July 19th, 2010
While riding the subway today I saw a man my age holding a book titled “Agile Software Development, Principles, Patterns and Practices”. This reminded me of a few opinions I had about software engineering. I talked in a previous post about how important it is to take into consideration the final product that’s going to be developed in order to choose a proper software engineering methodology.

Today I’d like to keep talking about that but focusing more on the technological nature of that which is being developed. Here’s a short story to better explain why:
While I was working on Conira, early on I decided just for the hell of it that this game would be multithreaded. Given my previous experience developing the Solis engine I saw that you could dissect the game engine into three threads.
- One that renders all the objects in the game, be it sprites, particles or models. This thread uses the ‘get’ methods each object had.
- Another that updates all the objects in the game, their displacement, acceleration, hit detection, etc. This thread uses the ‘set’ and ‘get’ methods each object had.
- One thread that takes care of loading/unloading objects and handling the Operating System calls like windows and messages.
Initially I thought this would be a good design. I elaborated on how the threads would communicate, tried to figure out and reduce potential deadlocks/livelocks and set off to code!
It wasn’t but one day later that I had to redo the whole design.
You see, there is a technical limitation when dealing with rendering systems, threads and the Win32 specification. MSDN clearly states that “the same thread that RENDERS the objects MUST be the same thread that CREATES the WINDOW and handles its OPERATING SYSTEM calls.” In other words, threads 1 and 3 had to be the same. In the end it wasn’t a big alteration, merely a trivial one given the design I had, but it did get me re-thinking certain software engineering practices people take for granted.
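The reworked split, with one thread owning the window and the rendering and another doing the updates, can be sketched in portable C++ like this. It is only an outline under stated assumptions: GameState, update_loop and run_game are made-up names, there is no real window or renderer here, and a single mutex stands in for whatever synchronization the actual engine uses.

```cpp
#include <functional>
#include <mutex>
#include <thread>

// Shared game state guarded by one mutex. Names are illustrative only.
struct GameState
{
    std::mutex lock;
    int ticks = 0;   // stands in for positions, velocities, and so on
};

// Update thread: advances the simulation a fixed number of steps ('set' side).
void update_loop(GameState& state, int steps)
{
    for (int i = 0; i < steps; ++i)
    {
        std::lock_guard<std::mutex> guard(state.lock);
        ++state.ticks;
    }
}

// Stand-in for the merged window + render thread: it starts the update
// thread, "renders" a few frames by reading the state ('get' side), then
// waits for the updater to finish and returns the final tick count.
int run_game(int steps)
{
    GameState state;
    std::thread updater(update_loop, std::ref(state), steps);
    int last_seen = 0;
    for (int frame = 0; frame < 3; ++frame)
    {
        std::lock_guard<std::mutex> guard(state.lock);
        last_seen = state.ticks;   // a real renderer would draw from here
    }
    updater.join();
    (void) last_seen;
    return state.ticks;
}
```

The thread that calls run_game plays the role of the one that created the window, so it is also the one that reads state for drawing. The update thread never touches anything window related.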
When I was in college I was basically told that “code development is a final product of a series of design decisions”. You figure out your system first, its classes, structures and layers, and then the code you write is merely an extension of that. In other words, take the broad view of a project and, as you develop it, handle the inner details and intricacies as they come. But that type of methodology fails to work in development environments where those same inner details and intricacies are what actually define the system as a whole.
I think Agile Software advocates say the same thing but with different words. “Work on the code first, let the inner details of a system be dealt with as they come, so don’t worry about documentation, design, etc. Code, code and code!”. They also usually point to cases where such methodologies worked. However, they fail to point out the architecture of the systems they developed. Most, if not all, of the success stories I heard were either web applications or written in a managed language such as Java or C#.
Well, it makes sense for those projects to work in those scenarios! They were built on top of languages where it is expected to delegate the inner details of a system to whoever is handling them. So all they have proven is that those agile methods work in those environments, which is ok, but not that agile methods work as a whole, which is what they usually advocate. Honestly, I’m yet to see a long-running success story of a low-level driver or a highly optimized high-level assembly language project using Agile methodologies.
Don’t get me wrong, I’m not saying that Agile methods suck or that they don’t work. They do and can work quite well, but one must understand the context in which they work, or else fall into the trap of believing that they work in any environment.
If one doesn’t take into consideration the end product as well as its internal architecture while making engineering decisions, he’ll eventually end up doing either bad or lucky engineering.
Planning vs. Common sense
By J.Raza On April 15th, 2010
Today I’ll review another book I read, Software Engineering for Game Developers.

As a book that concerns itself with software engineering, I say it does the job. It’s an 800-page beast covering topics from UML and resource management to project risks, stipulations and damage control. More interestingly, it tries to tie that to actual game development. It goes quite in-depth on each of its topics and gives plenty of further references if any particular one interests you. Overall it’s a good book.
But that’s not really what I want to discuss.
You see, the book comes with a game that a small team of developers made using a variety of software engineering techniques. This serves to show how one can apply them to game development. The game is quite stable, performing well given the myriad of things attached to it (3D mesh loading, textures, events, GUI, scripts, etc). However there’s just one problem with it:
It’s not really any fun.
Now I’m sure the goal of the book is to teach software developers techniques they can use in their own development, with the game being fun not a pre-requisite in this scenario. It had to be functional, not fun.
That’s one of the ways game development differs from other kinds of software. If a program manager assumes that a project will:
- take w hours of development
- need y developers
- cost z dollars
- have k use cases
And it ends up taking exactly that, most likely that was a successful project.
The problem with game development is that even if you manage to make a game with all your estimates correct, if the game in the end is not fun then you still have an unsuccessful project. Sure it can still sell well, but it’ll probably take tons of marketing to make up for it.
Some could say that fun is a non-functional requirement, which is true. However, that undermines its importance to the actual game development process. Successful game companies have long realized this and built entire systems of software engineering whose sole goal is to enhance and facilitate adding ‘fun’ to a game.
Valve Software uses their CABAL system, id Software their endless internal engine prototypes, Blizzard their QA tests, and so on.
Which leads me to conclude:
Sometimes the most important aspect of a project cannot be expressed in a process or in a form. That is still not a justification to leave it out of processes and forms.
Learning and re-learning
By J.Raza On April 4th, 2010
Today I’ll review two books I read: Professional Assembly and Assembly Language Step-By-Step.


One interesting thing is that both approach the same topic, teaching assembly language, from different perspectives. Step-by-step takes care of explaining the history of Intel’s CPU architecture, from the segmented mode to flat mode, detailing the intricacies of segmented programming along the way. It takes over 150 pages just to get to the first line of assembly code in the book.
Professional Assembly on the other hand is a rocket ship, making no apologies and going full throttle into Intel’s assembly structures and the GNU assembler (gas) syntax.
Step-by-step takes care of explaining basic computer architecture so that in the end you can understand assembly and its logic. Professional assumes you know that and blasts off, which I think in the end makes it a better book.
You see, I’ll probably go back to Professional Assembly when I need a reference or want to review a topic, because while Step-by-step is good at explaining things, once you’ve got them there ain’t much left to go back to. Now you may ask me, which one should you buy? My answer is:
Both.
You see, one thing that I learned is that it’s good to study a particular topic several times, even if you are already familiar with it. An author’s approach, no matter how good it is, will not be universal. That’s because it’s his approach to the topic, and there are other things that could be either better explained or better elaborated upon.
And the more you look at and study a particular topic, the more universal your approach to it will be.
Working with a team of one
By J.Raza On March 29th, 2010
I was having a chat with my good friend this afternoon. He was having some trouble getting the XML parser on the iPhone to work and I tried to help him out where I could. I gave him the tip to always check the XML parser’s error code, so you know when something goes wrong, even if most likely nothing will.
He said that he was doing this project solo, and that such verification wasn’t necessary. That’s when I remarked “Two months from today you will be another person working on this project.” He got the joke and we both laughed.
You see even when you are working alone, as projects evolve, change focus or simply take time, you will forget about different assumptions you had previously taken. You will also forget about previous remarks you had in your code, which functions should be called, by whom and so on.
A solo project is still being developed by a team of individuals. They’re just separated by the 4th dimension.
Book Review
By J.Raza On March 22nd, 2010
I finished reading Write Great Code Vol. 2 yesterday. I must say it’s a good book, although I have one complaint.
Let’s start with the good side. It’s very well written and thoroughly explains how your compiler turns high-level language statements into low-level assembly. It then goes on for hundreds of pages explaining how to optimize that, from function calls, arrays and structures to small if/else jump conditions and the differences between local, static and global variables. It basically covers most of the cases where a programmer would ask himself “how will this turn into assembly?”.
The only complaint I have is not with the book itself but with the author telling you over and over and over again that, if you want to look further into a particular topic the book doesn’t cover (like cache lines), you should get his other book, Write Great Code Vol. 1. I don’t really have anything against good advertisement, but it just gets tiresome after the 10th time he does that.
Still, it’s a good read and I enjoyed it. My copy of WGC Vol. 1 is already on the way from Amazon and I’m pretty sure it’ll be a good read as well.















