I’ve been working on an algorithm to generate a set of polygons from a given image using CUDA and GLSL. This technique is generally called Depth Peeling [1,2,3]. To set up the data flowing to and from textures, I used the format GL_RGBA and the type GL_UNSIGNED_BYTE. Looking at that, one would naturally expect the color components to arrive as four bytes in the order Red, Green, Blue, Alpha.
In this case, extracting the color components could be done using the following code:
int pixel = texture[i];
int r = ( pixel >> 24 ) & 0xFF;
int g = ( pixel >> 16 ) & 0xFF;
int b = ( pixel >>  8 ) & 0xFF;
int a = pixel & 0xFF;
But, to my surprise, the components were retrieved in the inverse order. After investigating why this happens, and finding others asking the same question, as here, I tracked down the cause. In the code above I was reading the pixel through an integer type. Intel architectures are little endian, so the least significant byte of a word is stored first and the most significant byte last. This is why the color components come out reversed, even though a specific byte order was requested from OpenGL.
To work around this problem, I changed the type from integer to CUDA’s uchar4. This solves it because byte ordering only applies to multi-byte words (16-, 32-, and 64-bit); individual bytes are unaffected. After this change in the code, I was able to get the color components in the right order.
[1] Cass Everitt. 2001. “Interactive Order-Independent Transparency”.
[2] Matthias Trapp and Jürgen Döllner. 2008. “Real-Time Volumetric Tests Using Layered Depth Images”.
[3] Fang Liu, Meng-Cheng Huang, Xue-Hui Liu, and En-Hua Wu. 2009. “Efficient Depth Peeling via Bucket Sort”.