UPDATE
As datenwolf points out, my explanation is tied to the fixed-function OpenGL 2 pipeline, which has since been deprecated. This means you have to do your own transformation from world space into screen space if you want to avoid the deprecated methods. Sadly, this little footnote hasn't yet been attached to every last bit of OpenGL sample code or commentary in the universe.
Of course I don't know why it's necessarily a bad thing to use the existing GL2 pipeline before picking a library to do the same or building one yourself.
ORIGINAL
I'm playing around with JOGL myself, though I have some limited prior experience with OpenGL. OpenGL uses two matrices to transform the 3D points you pass it from 3D model space into 2D screen space: the Projection matrix and the ModelView matrix.
The projection matrix handles the mapping from the 3D world onto the 2D screen, projecting a higher-dimensional space onto a lower-dimensional one. You can get lots more detail by Googling gluPerspective, which is a function in the GLU utility library for setting that matrix.
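To make the projection step a bit more concrete, here is a minimal plain-Java sketch of the matrix described on the gluPerspective reference page, stored column-major the way OpenGL expects. The class and method names (PerspectiveSketch, perspective, transform) are made up for illustration and are not part of JOGL or GLU.

```java
final class PerspectiveSketch {
    /** Builds a column-major 4x4 perspective matrix using the gluPerspective formula. */
    static double[] perspective(double fovyDegrees, double aspect, double zNear, double zFar) {
        double f = 1.0 / Math.tan(Math.toRadians(fovyDegrees) / 2.0); // cotangent of half the field of view
        double[] m = new double[16]; // starts as all zeros
        m[0] = f / aspect;
        m[5] = f;
        m[10] = (zFar + zNear) / (zNear - zFar);
        m[11] = -1.0; // copies -z into w, which drives the perspective divide
        m[14] = (2.0 * zFar * zNear) / (zNear - zFar);
        return m;
    }

    /** Multiplies a column-major 4x4 matrix by a 4-component column vector. */
    static double[] transform(double[] m, double[] v) {
        double[] r = new double[4];
        for (int row = 0; row < 4; row++) {
            for (int col = 0; col < 4; col++) {
                r[row] += m[col * 4 + row] * v[col];
            }
        }
        return r;
    }

    public static void main(String[] args) {
        double[] p = perspective(90.0, 1.0, 1.0, 100.0);
        double[] clip = transform(p, new double[] {0, 0, -1, 1}); // a point on the near plane
        System.out.println("normalized depth: " + clip[2] / clip[3]);
    }
}
```

With a near plane at 1 and a far plane at 100, a point on the near plane ends up at a normalized depth of about -1 and a point on the far plane at about +1 once you divide by w.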
The ModelView matrix (see note 1), on the other hand, is responsible for transforming the 3D coordinates of items from scene space into view (or camera) space. How exactly this is done depends on how you're representing the camera. Three common ways of representing the camera are
- A vector for the position, a vector for the target of the camera, and a vector for the 'up' direction
- A vector for the position plus a quaternion for the orientation (plus perhaps a single floating point value for scale, or leave scale set to 1)
- A single 4x4 matrix containing position, orientation and scale
Whichever one you use will require you to write code to translate the representation into something you can give to the OpenGL methods to set up the ModelView matrix, as well as code that translates user actions into modifications to the camera data.
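As an illustration of the first representation (position, target, up), the sketch below builds a view matrix the same way the gluLookAt reference page describes it. It is plain Java with no JOGL dependency; LookAtSketch and its helper methods are hypothetical names of my own, and it assumes the up vector is not parallel to the viewing direction.

```java
final class LookAtSketch {
    static double[] sub(double[] a, double[] b) {
        return new double[] {a[0] - b[0], a[1] - b[1], a[2] - b[2]};
    }

    static double dot(double[] a, double[] b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    static double[] cross(double[] a, double[] b) {
        return new double[] {
            a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]
        };
    }

    static double[] normalize(double[] v) {
        double len = Math.sqrt(dot(v, v));
        return new double[] {v[0] / len, v[1] / len, v[2] / len};
    }

    /** Returns a column-major view matrix for a camera at eye looking towards target. */
    static double[] lookAt(double[] eye, double[] target, double[] up) {
        double[] f = normalize(sub(target, eye)); // forward
        double[] s = normalize(cross(f, up));     // side (camera right)
        double[] u = cross(s, f);                 // corrected up
        return new double[] {
            s[0], u[0], -f[0], 0,                     // column 0
            s[1], u[1], -f[1], 0,                     // column 1
            s[2], u[2], -f[2], 0,                     // column 2
            -dot(s, eye), -dot(u, eye), dot(f, eye), 1 // column 3 (translation)
        };
    }

    public static void main(String[] args) {
        double[] view = lookAt(new double[] {0, 0, 5}, new double[] {0, 0, 0}, new double[] {0, 1, 0});
        System.out.println("world origin sits at view-space z = " + view[14]);
    }
}
```

The resulting array is already in the column-major layout that glLoadMatrixd takes, so in the GL2 pipeline it could be loaded as the ModelView matrix directly.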
There are a number of demos in JOGL-Demos and JOCL-Demos that involve this kind of manipulation. For instance, this class is designed to act as a kind of primitive camera which can zoom in and out and rotate around the origin of the scene, but cannot turn otherwise. It's therefore represented as only 3 floats: an X and a Y rotation and a Z distance. It applies its transform to the ModelView something like this (see note 2):
gl.glMatrixMode(GL2.GL_MODELVIEW);
gl.glLoadIdentity();
gl.glTranslatef(0, 0, z);
gl.glRotatef(rotx, 1f, 0f, 0f);
gl.glRotatef(roty, 0f, 1.0f, 0f);
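If you want to avoid the deprecated calls, the same translate-then-rotate sequence can be composed by hand. This is only a sketch under my own assumptions: OrbitCameraSketch and its methods are invented names, angles are in degrees to match glRotatef, and matrices are column-major arrays of 16 doubles.

```java
final class OrbitCameraSketch {
    static double[] identity() {
        return new double[] {1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1};
    }

    /** Returns a * b for column-major 4x4 matrices. */
    static double[] multiply(double[] a, double[] b) {
        double[] r = new double[16];
        for (int col = 0; col < 4; col++)
            for (int row = 0; row < 4; row++)
                for (int k = 0; k < 4; k++)
                    r[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
        return r;
    }

    static double[] translation(double x, double y, double z) {
        double[] m = identity();
        m[12] = x; m[13] = y; m[14] = z;
        return m;
    }

    static double[] rotationX(double degrees) {
        double c = Math.cos(Math.toRadians(degrees)), s = Math.sin(Math.toRadians(degrees));
        double[] m = identity();
        m[5] = c; m[6] = s; m[9] = -s; m[10] = c;
        return m;
    }

    static double[] rotationY(double degrees) {
        double c = Math.cos(Math.toRadians(degrees)), s = Math.sin(Math.toRadians(degrees));
        double[] m = identity();
        m[0] = c; m[2] = -s; m[8] = s; m[10] = c;
        return m;
    }

    /** Same order as the GL calls: identity * translate * rotateX * rotateY. */
    static double[] orbit(double z, double rotx, double roty) {
        return multiply(multiply(translation(0, 0, z), rotationX(rotx)), rotationY(roty));
    }

    public static void main(String[] args) {
        double[] m = orbit(-10, 0, 90);
        System.out.println("scene origin ends up at view-space z = " + m[14]);
    }
}
```

Because each GL call post-multiplies the current matrix, the rotations are applied to a point first and the translation last, which is why the scene origin always ends up at distance z straight ahead of the camera.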
I'm currently experimenting with a Quaternion+Vector+Float based camera using the Java Vecmath library, and I apply my camera transform like this:
Quat4d orientation;
Vector3d position;
double scale;
...
public void applyMatrix(GL2 gl) {
    // Combine orientation, position and uniform scale into one 4x4 matrix.
    Matrix4d matrix = new Matrix4d(orientation, position, scale);
    // OpenGL expects column-major order, so write the matrix out column by column.
    double[] glmatrix = new double[] {
        matrix.m00, matrix.m10, matrix.m20, matrix.m30,
        matrix.m01, matrix.m11, matrix.m21, matrix.m31,
        matrix.m02, matrix.m12, matrix.m22, matrix.m32,
        matrix.m03, matrix.m13, matrix.m23, matrix.m33,
    };
    gl.glMatrixMode(GL2.GL_MODELVIEW);
    gl.glLoadMatrixd(glmatrix, 0);
}
1: The reason it's called the ModelView and not just the View matrix is that you can actually push and pop matrices on the ModelView stack (this is true of all OpenGL transformation matrices, I believe). Typically you either have a full stack of matrices representing various transformations of items relative to one another in the scene graph, with the bottom one representing the camera transform, or you have a single camera transform and keep everything in the scene graph in world space coordinates (which kind of defeats the point of having a scene graph, but whatever).
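The push/pop behaviour can be mimicked in a few lines of plain Java. This is a rough sketch of the idea and not how any GL driver implements it; MatrixStackSketch and its method names are invented for the example, and only translation is included to keep it short.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class MatrixStackSketch {
    private final Deque<double[]> stack = new ArrayDeque<>();

    MatrixStackSketch() {
        // Start with a single identity matrix (column-major).
        stack.push(new double[] {1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1});
    }

    /** Duplicates the top matrix so later changes can be undone with popMatrix. */
    void pushMatrix() {
        stack.push(stack.peek().clone());
    }

    void popMatrix() {
        if (stack.size() == 1) {
            throw new IllegalStateException("matrix stack underflow");
        }
        stack.pop();
    }

    /** Post-multiplies the top matrix by a translation; only the last column changes. */
    void translate(double x, double y, double z) {
        double[] m = stack.peek();
        for (int row = 0; row < 4; row++) {
            m[12 + row] += m[row] * x + m[4 + row] * y + m[8 + row] * z;
        }
    }

    double[] top() {
        return stack.peek().clone();
    }

    public static void main(String[] args) {
        MatrixStackSketch mv = new MatrixStackSketch();
        mv.translate(0, 0, -10);   // camera transform at the bottom of the stack
        mv.pushMatrix();
        mv.translate(2, 0, 0);     // transform for one item in the scene
        System.out.println("item x offset: " + mv.top()[12]);
        mv.popMatrix();            // back to the bare camera transform
        System.out.println("camera x offset: " + mv.top()[12]);
    }
}
```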
2: In practice you wouldn't see the calls to gl.glMatrixMode(GL2.GL_MODELVIEW); in the code because the GL state machine is simply left in MODELVIEW mode all the time unless you're actively setting the projection matrix.