Tuesday, October 25, 2011

Finding Mouse Position in World Coordinates

So recently I had trouble finding a proper way to find my mouse position in world coordinates. I'm currently trying to build an RTS-like interface where the player views the world from an isometric view and uses the mouse to give input to the game. There are a few options for doing this. One option is to develop a ray-tracing-like method that casts a ray from the mouse point in screen space and intersects it with the plane in world coordinates. In my opinion this option is not that effective, especially if the object consists of many planes.
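
For a single plane, the ray-based option is short. Here is a minimal sketch in C, assuming the ground is the plane z = 0 and that the ray origin and direction are already known in world space. Vec3 and ray_hit_ground are hypothetical names for illustration, not part of my program.

```c
#include <math.h>
#include <stdbool.h>

typedef struct { double x, y, z; } Vec3;

/* Intersect the ray origin + t * dir (t >= 0) with the plane z = 0.
   Returns false if the ray is parallel to the plane or points away from it. */
static bool ray_hit_ground(Vec3 origin, Vec3 dir, Vec3 *hit)
{
    if (fabs(dir.z) < 1e-12)
        return false;               /* parallel to the plane */
    double t = -origin.z / dir.z;   /* solve origin.z + t * dir.z = 0 */
    if (t < 0.0)
        return false;               /* plane is behind the ray origin */
    hit->x = origin.x + t * dir.x;
    hit->y = origin.y + t * dir.y;
    hit->z = 0.0;
    return true;
}
```

The cost grows with the number of planes to test, since each one needs its own intersection and the nearest hit has to be kept.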



The other way to do it, which is usually the more efficient one, is to use gluUnProject(...). As input, the function requires the mouse position (x, y), the depth value of the pixel under the mouse cursor (z), the modelview and projection matrices, the viewport, and three variables to receive the three results. The mouse position, matrices, and viewport are pretty straightforward. The trick is to find a proper depth value. We can simply use glReadPixels(...) to read the depth, but the problem is that OpenGL does not store the 'real' depth value: the depth buffer holds a nonlinear value between 0 and 1 rather than the actual distance from the camera. You can find the details here. gluUnProject itself takes that window-space value as its z input, so this only becomes a problem when you need the actual distance; in that case you have to modify things a bit so that OpenGL renders the real depth value. I found this website useful for this modification.
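
To illustrate how far the stored depth is from the 'real' one, here is a minimal sketch of the inverse mapping. It assumes a standard perspective projection (as set up by gluPerspective or glFrustum) and the default glDepthRange of 0 to 1. linearize_depth, near_plane, and far_plane are names chosen here for illustration.

```c
/* Convert a depth-buffer value d in [0, 1] back to the positive
   distance from the camera along the view direction.
   Assumes a standard perspective projection and glDepthRange(0, 1). */
static double linearize_depth(double d, double near_plane, double far_plane)
{
    double z_ndc = 2.0 * d - 1.0;   /* window depth -> NDC depth in [-1, 1] */
    return (2.0 * near_plane * far_plane) /
           (far_plane + near_plane - z_ndc * (far_plane - near_plane));
}
```

With a near plane at 1 and a far plane at 100, a stored depth of 0.5 corresponds to a distance of only about 1.98, which shows how much of the depth range is spent close to the camera.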

As for my program, I did something unusual for this screen-to-world coordinate conversion. The idea is simple: we represent the world coordinate as an RGB color. With the help of a frame buffer object (FBO), this can easily be done. First I create a frame buffer object to hold my world coordinate values. Notice that these values need much finer precision than an ordinary color, so we can't store them as 8-bit values. Instead we use 16 bits for each color component (GL_RGBA16 is still a normalized format, so the stored values lie in the range 0 to 1).
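
To see why 8 bits per component is not enough, here is a quick estimate of the smallest world-space step each format can represent. It is only a sketch: it assumes the coordinates span 800 units (the size of the base plane in my scene) and are quantized uniformly. quantization_step is a hypothetical helper.

```c
/* Smallest distinguishable step when `range` world units are stored
   in an unsigned normalized channel with `bits` bits (bits < 32). */
static double quantization_step(double range, int bits)
{
    double levels = (double)((1UL << bits) - 1UL);  /* e.g. 255 for 8 bits */
    return range / levels;
}
```

For 800 units this gives roughly 3.1 units per step at 8 bits, against about 0.012 units at 16 bits.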




// Create the framebuffer object and the texture that will hold the coordinates.
glGenFramebuffersEXT(1, &fb1);
glGenTextures(1, &tx1);

// Allocate a texture with 16 bits per component; no initial data (NULL).
glBindTexture(GL_TEXTURE_2D, tx1);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16, width, height, 0, GL_RGB, GL_FLOAT, NULL);
glBindTexture(GL_TEXTURE_2D, 0);

// Attach the texture as the color buffer of the FBO.
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fb1);
glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, GL_TEXTURE_2D, tx1, 0);
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);

Now that I have an FBO to hold my world coordinate values, I can simply render my scene into it. Since I don't need the intersection with every object, I only draw what I need into the FBO.

// Pass 1: render the base plane into the FBO with the coordinate shaders.
glUseProgram(programObj);
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fb1);
glClearColor(0.0, 0.0, 0.0, 1.0);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
gm.drawBase(); // I just need to find the mouse-base plane intersection.

// Switch back to the fixed-function pipeline and the default framebuffer.
glUseProgram(0);
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);

// Pass 2: render the visible scene as usual.
glClearColor(0.0, 0.0, 0.0, 1.0);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
gm.draw();
glutSwapBuffers();

Notice that when rendering to the FBO, I use my own shaders so that the RGB values represent the x, y, and z values. Here is my vertex shader:



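varying vec4 worldCoord;

void main()
{
   // Pass on the untransformed vertex position instead of the screen-space one.
   worldCoord = gl_Vertex;
   // Map x and y from [-400, 400] to [0, 1]; z is passed through unscaled.
   worldCoord.x = (worldCoord.x+400.0)/800.0;
   worldCoord.y = (worldCoord.y+400.0)/800.0;

   gl_FrontColor = gl_Color;
   gl_Position = ftransform();
}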

and my fragment shader:

varying vec4 worldCoord;

void main()
{
   gl_FragColor = vec4(worldCoord.x, worldCoord.y, worldCoord.z, 1.0);
}




The vertex shader is quite simple. I pass on the gl_Vertex value instead of the screen-space vertex position from ftransform(). Since the base plane in the scene is 800 units wide and high, and starts at -400 on the x and y axes, we need to normalize the values so they go from 0 to 1. That's why I add 400 to each value so it starts from 0, and divide it by 800 so it ends at 1.
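
The same mapping and its inverse can be written on the CPU side to check that the round trip works. This is a minimal sketch in C using the plane extents from above; encode_coord, decode_coord, and the two constants are hypothetical names.

```c
#define PLANE_MIN  (-400.0f)   /* base plane starts at -400 on x and y */
#define PLANE_SIZE (800.0f)    /* base plane is 800 units wide and high */

/* World coordinate -> value in [0, 1], mirroring the vertex shader. */
static float encode_coord(float world)
{
    return (world - PLANE_MIN) / PLANE_SIZE;
}

/* Value in [0, 1] -> world coordinate, the inverse of encode_coord. */
static float decode_coord(float color)
{
    return color * PLANE_SIZE + PLANE_MIN;
}
```

The corners of the plane map to 0 and 1, the center maps to 0.5, and decoding an encoded value returns the original coordinate up to rounding.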

After that I get the current mouse position and look up the world coordinate at that point by converting the RGB value back to x, y, and z coordinates.


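// Read the pixel under the cursor from the FBO. OpenGL's origin is at the
// bottom-left, so the mouse row is flipped (row 0 from the top is row height-1).
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fb1);
glReadPixels(x, height-y-1, 1, 1, GL_RGB, GL_FLOAT, value);
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);
// Undo the normalization done in the vertex shader.
gm.mousePos.x = (float)value[0]*800.0 - 400.0;
gm.mousePos.y = (float)value[1]*800.0 - 400.0;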

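
The y coordinate has to be flipped because GLUT reports the mouse position with the origin at the top-left corner of the window, while glReadPixels counts rows from the bottom. A minimal sketch of that conversion, assuming the mouse row is 0-based and height is the window height in pixels; window_to_gl_y is a hypothetical helper.

```c
/* Convert a 0-based mouse row measured from the top of the window into
   the 0-based row measured from the bottom that glReadPixels expects.
   Returns -1 if the mouse row lies outside the window. */
static int window_to_gl_y(int mouse_y, int height)
{
    if (mouse_y < 0 || mouse_y >= height)
        return -1;
    return height - 1 - mouse_y;
}
```

Note that the top row maps to height - 1 rather than height, since valid rows run from 0 to height - 1.
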
This solution is very simple, but notice that keeping three 16-bit (2-byte) values for every pixel of a screen with approximately 2 million pixels is expensive memory-wise. That is 6 bytes per pixel, or about 12 MB of video memory, and since GL_RGBA16 also stores an alpha channel the texture actually takes about 16 MB. It might be a lot to spend on one function, but perhaps this buffer can be reused later for other purposes such as rendering. If you have any questions regarding the code, feel free to email me.
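
The memory estimate is easy to reproduce. The sketch below assumes a 1920 x 1080 screen as the "approximately 2 million pixels" case and ignores any padding the driver may add; texture_bytes is a hypothetical helper.

```c
#include <stddef.h>

/* Bytes needed for a width x height texture with `channels` components
   of `bits_per_channel` bits each (no padding or mipmaps assumed). */
static size_t texture_bytes(size_t width, size_t height,
                            size_t channels, size_t bits_per_channel)
{
    return width * height * channels * (bits_per_channel / 8);
}
```

For 1920 x 1080 this gives 12,441,600 bytes with three 16-bit components and 16,588,800 bytes with four.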

PS: Screen/Video capture coming soon :)
