Monday, July 4, 2011

Mesh Data Structure: Memory Efficient Data Structure

Well to be honest, the content is not as cool as the article :D Recently, I spent a few hours to test how to create a good data structure for a vertex information. So my current 'engine' (yeah, rite!) store vertices (and their attributes) with this following structure:
1. There's an array of faces(triangles). Each triangle holds the index of vertex, normal, and texture coordinate it uses.
2. There's an array of vertices. Each vertex holds the x, y, and z value.
3. There's an array of normals. Each normal holds the x, y, and z value.
4. There's an array of texture coordinate. Each coordinate holds the x and y value.


This called a Face-Vertex method. While this method also has an information about mesh topology, its not very inefficient. To draw the scene, the program loop through a triangle, check its vertex (v), normal (vn), and texture coordinate (vt), and draw it on the scene.

for(int i = 0; i < triangles.size; i++)
{
  tri = triangle.at(i);

  v1 = vertexArray.at(triangles.getV1Index());
  v2 = vertexArray.at(triangles.getV2Index());
  v3 = vertexArray.at(triangles.getV3Index());

  vn1 = normalArray.at(triangles.getVn1Index());
  vn2 = normalArray.at(triangles.getVn1Index());
  vn3 = normalArray.at(triangles.getVn2Index());

  vt1 = textureArray.at(triangles.getVt1Index());
  vt2 = textureArray.at(triangles.getVt2Index());
  vt3 = textureArray.at(triangles.getVt3Index());

  //draw scene here
}

Apparently this is not the most efficient way to loop through triangles. Accessing array randomly is not good, especially if you want to do that 30-60 times in a second, and this method access the array randomly a lot. First the vertex, normal and texture coordinate is kept on a different memory address. Jumping from one address to another adress takes time. Secondly the triangle vertex/normal/texture coordinate index is not sequential.

So I decided to reconstruct my drawing scene to an array friendly way, which means the vertex have to be sequentially accessed. One way to do that is to change the vertex-mesh data structure to independent version, or streaming meshes. Basically before you do the drawing process (preferably on the initialization for static/environment object), you manage the vertex list so it can be accessed sequentially without concerning triangles array. The code below shows how its done:

while (i < arraySize) {
  v1 = vertexArray.at(i++);
  v2 = vertexArray.at(i++);
  v3 = vertexArray.at(i++);

  //drawTriangle(v1, v2, v3);
}

To be able to do this, every vertex on every triangles has a single entry on vertexArray, ignoring the fact that there might be a shared vertex. This is way faster to loop rather than the first one. But it has weakness: it requires more memory than the first method, and there is no topology information about the mesh. Its also harder to remove vertex: we have to remove the vertex from the array while in the face-vertex, we can simply remove the face and deal with the vertex later.

The next optimization would be more related to C++ rather than computer graphic. Notice that in the method above we add i three times and use it as an index. This can be optimized furthermore. I did a small test with C++ using a single 3*N vertex array vs N triangle array. The difference of this triangles compared to the face-vertex method is this triangles actually contains the value of the vertex instead of its index. This is an example of a triangle class:

class Triangle {
  v3D v1, v2, v3;
}

where v3D has an explicit x, y, z values. Looping through the triangle would look like this:

while (i < triangleSize) {
  v1 = triangleArray.at(i)
  v2 = triangleArray.at(i);
  v3 = triangleArray.at(i);

  //drawTriangle(v1, v2, v3);

  i++;
}

On my machine, using 1,000,000 vertices, this method runs almost twice as fast compared to the one with vertexArray. Since I also need to keep other attributes such as normal and texture coordinate on the triangle struct. What I'm still unable to figure out though, some readings suggest that keeping the attributes sequentially is faster than not. It means if I want to accessv1, v2, v3, vn1, vn2, vn3, vt1, vt2, vt3, I have to keep it in that sequence too so I looks like this:

class Triangle {
  v3D v1, v2, v3, vn1, vn2, vn3;
  v2D vt1, vt2, vt3;
}

But it seems that on my test program, this type of structure has no effect performance-wise compared to if I keep it randomly, eg:

class Triangle {
  v3D v1, vn1, v2, vn2, v3, vn3;
  v2D vt1, vt2, vt3;
}

But it might be an architecture specific advantage so I can't confirm for other processor. I also notice there is no (significant?) difference between applying a structure compared to a class. So for the sake of OOP, lets just use class until further notice ;) However please be warned that this method dont have any topology information whatsoever. So it's safe and best to perform this only on static meshes yet.

PS: A little bit technical about this test program: built on OS X Intel Core Duo 2.0GHz Memory 2GB, time measurement compared using C++ clock() with clock ticks.

No comments: