Why would GLCapabilities.setHardwareAccelerated(true/false) have no effect on performance?
- by Luke
I've got a JOGL application in which I am rendering 1 million textured point sprites (all sharing one texture) and 1 million lines between them. Basically it's a ball-and-stick graph.
I am storing the vertices in a vertex buffer on the card and referencing them via index buffers, which are also stored on the card. On each pass through the draw loop I am basically doing this:
gl.glBindBuffer(GL.GL_ARRAY_BUFFER, <buffer id>);
gl.glBindBuffer(GL.GL_ELEMENT_ARRAY_BUFFER, <buffer id>);
gl.glDrawElements(GL.GL_POINTS, <size>, GL.GL_UNSIGNED_INT, 0);
gl.glBindBuffer(GL.GL_ARRAY_BUFFER, <buffer id>);
gl.glBindBuffer(GL.GL_ELEMENT_ARRAY_BUFFER, <buffer id>);
gl.glDrawElements(GL.GL_LINES, <size>, GL.GL_UNSIGNED_INT, 0);
I noticed that the JOGL library is pegging one of my CPU cores. Every frame, the run method internal to the library takes quite a long time. I'm not sure why this is happening, since I have called setHardwareAccelerated(true) on the GLCapabilities used to create my canvas.
What's more interesting is that when I changed it to setHardwareAccelerated(false), there was no impact on performance at all.
Is it possible that my code is not using hardware rendering even when it is set to true? Is there any way to check?
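One possible way to check, sketched under assumptions: OpenGL reports the active implementation through glGetString(GL_VENDOR) and glGetString(GL_RENDERER), and software implementations tend to identify themselves there. The helper below is hypothetical (the class name, method name, and the list of substrings are mine, and the list is not exhaustive). It only classifies a renderer string, so it runs without a GL context.

```java
import java.util.Locale;

final class RendererCheck {
    // Substrings that appear in GL_RENDERER strings of some well-known software
    // implementations. Illustrative assumption only; the list is not exhaustive.
    private static final String[] SOFTWARE_HINTS = {
        "gdi generic",             // Microsoft's built-in software OpenGL 1.1
        "llvmpipe",                // Mesa LLVM-based software rasterizer
        "softpipe",                // Mesa reference software rasterizer
        "software rasterizer",
        "apple software renderer",
        "swiftshader"
    };

    /** Returns true if the renderer string matches one of the known software hints. */
    static boolean looksLikeSoftwareRenderer(String renderer) {
        if (renderer == null) {
            return false; // nothing to classify
        }
        String lower = renderer.toLowerCase(Locale.ROOT);
        for (String hint : SOFTWARE_HINTS) {
            if (lower.contains(hint)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // In the real application the string would come from the driver, e.g.:
        //   String renderer = gl.glGetString(GL.GL_RENDERER);
        String renderer = args.length > 0 ? args[0] : "GDI Generic";
        System.out.println(renderer + " -> software? " + looksLikeSoftwareRenderer(renderer));
    }
}
```

A renderer string naming the graphics card would suggest the context itself is hardware accelerated, although that alone may not rule out the driver handling an individual feature on the CPU.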
EDIT:
As suggested, I have tried breaking my calls up into smaller chunks. I have also tried using glDrawRangeElements and respecting the limits it reports. Both approaches simply resulted in the same pegged CPU usage and worse frame rates.
I have also narrowed the problem down to a simpler example where I just render 4 million textured points (no lines). The draw loop then just does this:
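For reference, the chunking can be sketched roughly as follows. This is not the exact code from the application: the class and method names are hypothetical, and the per-call limit in main is a made-up value standing in for whatever glGetIntegerv returns for GL_MAX_ELEMENTS_INDICES. The sketch only computes the index ranges; each range would then be passed to one draw call.

```java
import java.util.ArrayList;
import java.util.List;

final class DrawChunks {
    /**
     * Splits totalIndices indices into {firstIndex, count} ranges of at most
     * maxIndicesPerCall indices, never cutting a primitive in half.
     */
    static List<int[]> split(int totalIndices, int maxIndicesPerCall, int verticesPerPrimitive) {
        if (totalIndices < 0 || verticesPerPrimitive <= 0 || maxIndicesPerCall < verticesPerPrimitive) {
            throw new IllegalArgumentException("invalid chunking parameters");
        }
        // Round the chunk size down to a multiple of the primitive size
        // (1 for GL_POINTS, 2 for GL_LINES) so no primitive straddles two calls.
        int chunk = maxIndicesPerCall - (maxIndicesPerCall % verticesPerPrimitive);
        List<int[]> ranges = new ArrayList<>();
        for (int first = 0; first < totalIndices; first += chunk) {
            ranges.add(new int[] { first, Math.min(chunk, totalIndices - first) });
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 4 million point indices with a hypothetical limit of 1,048,576 indices per call.
        for (int[] r : split(4_000_000, 1_048_576, 1)) {
            // With GL_UNSIGNED_INT indices in a bound element buffer, the byte
            // offset passed to the draw call would be firstIndex * 4.
            System.out.println("first=" + r[0] + " count=" + r[1] + " byteOffset=" + (r[0] * 4L));
        }
    }
}
```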
gl.glEnableClientState(GL.GL_VERTEX_ARRAY);
gl.glEnableClientState(GL.GL_INDEX_ARRAY);
gl.glClear(GL.GL_COLOR_BUFFER_BIT | GL.GL_DEPTH_BUFFER_BIT);
gl.glMatrixMode(GL.GL_MODELVIEW);
gl.glLoadIdentity();
<... Camera and transform related code ...>
gl.glEnableVertexAttribArray(0);
gl.glEnable(GL.GL_TEXTURE_2D);
gl.glAlphaFunc(GL.GL_GREATER, ALPHA_TEST_LIMIT);
gl.glEnable(GL.GL_ALPHA_TEST);
<... Bind texture ...>
gl.glBindBuffer(GL.GL_ARRAY_BUFFER, <buffer id>);
gl.glBindBuffer(GL.GL_ELEMENT_ARRAY_BUFFER, <buffer id>);
gl.glDrawElements(GL.GL_POINTS, <size>, GL.GL_UNSIGNED_INT, 0);
gl.glDisable(GL.GL_TEXTURE_2D);
gl.glDisable(GL.GL_ALPHA_TEST);
gl.glDisableVertexAttribArray(0);
gl.glFlush();
Here the first buffer contains 12 million floats (the x, y, z coordinates of the 4 million points) and the second (element) buffer contains 4 million integers. In this simple example the element buffer simply holds the integers 0 through 3999999.
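To put those buffer sizes in bytes, here is a small worked calculation. It assumes 4-byte floats and 4-byte ints (Float.BYTES and Integer.BYTES are used as stand-ins for BufferUtil.SIZEOF_FLOAT and BufferUtil.SIZEOF_INT), and the class and method names are hypothetical.

```java
final class BufferSizes {
    /** Bytes needed for x, y, z float coordinates of the given number of points. */
    static long vertexBytes(long points) {
        return points * 3L * Float.BYTES;
    }

    /** Bytes needed for one unsigned-int index per point. */
    static long indexBytes(long points) {
        return points * Integer.BYTES;
    }

    public static void main(String[] args) {
        long points = 4_000_000L;
        System.out.println("vertex buffer bytes: " + vertexBytes(points)); // 48000000
        System.out.println("index buffer bytes:  " + indexBytes(points));  // 16000000
    }
}
```

So the two buffers together take roughly 64 MB.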
I really want to know what is being done in software that is pegging my CPU, and how I can make it stop (if I can).
My buffers are generated by the following code:
gl.glBindBuffer(GL.GL_ARRAY_BUFFER, <buffer id>);
gl.glBufferData(GL.GL_ARRAY_BUFFER, <size> * BufferUtil.SIZEOF_FLOAT,
        <buffer>, GL.GL_STATIC_DRAW);
gl.glVertexAttribPointer(0, 3, GL.GL_FLOAT, false, 0, 0);
and:
gl.glBindBuffer(GL.GL_ELEMENT_ARRAY_BUFFER, <buffer id>);
gl.glBufferData(GL.GL_ELEMENT_ARRAY_BUFFER, <size> * BufferUtil.SIZEOF_INT,
        <buffer>, GL.GL_STATIC_DRAW);
ADDITIONAL INFO:
Here is my initialization code:
gl.setSwapInterval(1); //Also tried 0
gl.glShadeModel(GL.GL_SMOOTH);
gl.glClearDepth(1.0f);
gl.glEnable(GL.GL_DEPTH_TEST);
gl.glDepthFunc(GL.GL_LESS);
gl.glHint(GL.GL_PERSPECTIVE_CORRECTION_HINT, GL.GL_FASTEST);
gl.glPointParameterfv(GL.GL_POINT_DISTANCE_ATTENUATION,
        POINT_DISTANCE_ATTENUATION, 0);
gl.glPointParameterfv(GL.GL_POINT_SIZE_MIN, MIN_POINT_SIZE, 0);
gl.glPointParameterfv(GL.GL_POINT_SIZE_MAX, MAX_POINT_SIZE, 0);
gl.glPointSize(POINT_SIZE);
gl.glTexEnvf(GL.GL_POINT_SPRITE, GL.GL_COORD_REPLACE, GL.GL_TRUE);
gl.glEnable(GL.GL_POINT_SPRITE);
gl.glClearColor(clearColor.getX(), clearColor.getY(),
        clearColor.getZ(), 0.0f);
Also, I'm not sure whether this helps, but when I drag the entire graph off the screen, the FPS shoots back up and the CPU usage falls to 0%. This seems obvious and intuitive to me, but I thought it might give someone else a hint.