One of the mysteries of the modern age is the existence of two distinct lines of graphics cards from each of the two big manufacturers, Nvidia and ATI/AMD: gamer-level cards and professional-level cards. What are the differences? Obviously, gamer-level cards are cheap, because the companies face stiff competition from each other and want to sell as many of them as possible to make a profit. So why are professional-level cards so much more expensive? For comparison: an “entry-level” $700 Quadro 4000 is significantly slower than a $530 high-end GeForce GTX 680, at least according to my measurements using several Vrui applications, and the closest performance equivalent to a GeForce GTX 680 I could find was a Quadro 6000, for a whopping $3660. Granted, the Quadro 6000 has 6GB of video RAM to the GeForce’s 2GB, but that doesn’t explain the price difference.
So, is the Quadro a lot faster than the GeForce? According to web lore, it is, by a large factor of 10x or so (oh how I wish I could find a link to a benchmark right now, but please read on). But wait: the quoted performance difference is for “professional” or “workstation” applications. What are those? Well, CAD, obviously. What’s common in CAD? Wireframe rendering and double-sided surfaces. Could it be that these two features were specifically targeted for crippling in the GeForce driver, because they are so common in “workstation” applications and so rarely used in games, to justify the Quadro’s price tag? No, right?
Well, I don’t have a CAD application at hand, but I have 3D Visualizer, which uses double-sided surfaces to render contour- or iso-surfaces in 3D volumetric data. From way back, on a piddly GeForce 3, I knew that the frame rate drops precipitously when enabling double-sided surfaces (for those who don’t know: double-sided surfaces are not backface-culled, and are illuminated from both sides, often with different material properties on either side). I don’t recall exact numbers, but the difference was significant, though not outrageous: say, a factor of 2-3. That makes sense, considering that OpenGL’s lighting engine has to do twice the amount of work. On a Quadro, the performance difference used to be, and still is, negligible. That makes sense as well: on special “professional” silicon, the two lighting calculations would run in parallel.
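For reference, this is what the legacy fixed-function setup for double-sided surfaces looks like; the material colors here are made-up illustration values, not anything from 3D Visualizer:

```c
/* Fixed-function OpenGL setup for double-sided surfaces: */
glDisable(GL_CULL_FACE);                          /* don't cull back faces */
glLightModeli(GL_LIGHT_MODEL_TWO_SIDE, GL_TRUE);  /* light both sides */

/* Optionally, different material properties on either side: */
GLfloat frontColor[4] = {0.8f, 0.2f, 0.2f, 1.0f};  /* example values */
GLfloat backColor[4]  = {0.2f, 0.2f, 0.8f, 1.0f};
glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, frontColor);
glMaterialfv(GL_BACK,  GL_AMBIENT_AND_DIFFUSE, backColor);
```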
But a couple of years ago, I got a rude awakening. On a GeForce 285, 3D Visualizer’s isosurfaces ran perfectly OK. Then an external user upgraded to a GeForce 480, and the bottom fell out. Isosurfaces that rendered on the 285 at a sprightly 60 or so FPS rendered on the 480 at a sluggish 15 FPS. That made no sense, since everything else was significantly faster on the 480. What made even less sense is that double-sided surfaces were a factor of 13 slower than single-sided surfaces (remember the claimed 10x Quadro speed-up I mentioned above?).
Now think about that. One way of simulating double-sided surfaces is to render single-sided surfaces twice, with triangle orientations and normal vectors flipped. That would obviously take about twice as long. So where does the ginormous factor of 13 come from? There must be some feature specific to the GeForce’s hardware that makes double-sided surfaces slow. Turns out, there isn’t.
Double-sided surfaces are slow when rendered in the “standard” OpenGL way, i.e., by enabling two-sided lighting via glLightModeli(GL_LIGHT_MODEL_TWO_SIDE, GL_TRUE). That’s how legacy CAD software would do it. However, if an application implements the exact same formulas used for double-sided lighting by fixed-function OpenGL in a shader program, the difference evaporates. Suddenly, the GeForce is just as fast as the Quadro. I didn’t believe this myself at first, nor did I expect it. I created a double-sided surface shader expecting to go from a 13x penalty down to a 2x penalty (remember: twice as many calculations), but I measured only a 1.1x-1.3x penalty, depending on overdraw. To restate this clearly: if implemented via a shader, double-sided surfaces on a GeForce are exactly as fast as on a Quadro; using fixed-function OpenGL, they are 13 times slower.
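For the curious, here is a minimal sketch of what the fragment stage of such a shader looks like, mirroring the fixed-function two-sided formulas for a single directional light. This is not Vrui’s actual shader; the variable names are my own, and specular and attenuation terms are omitted for brevity:

```glsl
// Fragment shader sketch: two-sided lighting via GLSL 1.20 built-in state.
varying vec3 eyeNormal; // eye-space normal interpolated from the vertex shader

void main()
	{
	/* Flip the normal for back-facing fragments: */
	vec3 normal = normalize(gl_FrontFacing ? eyeNormal : -eyeNormal);

	/* Select the front or back material: */
	vec4 ambient, diffuse;
	if(gl_FrontFacing)
		{
		ambient = gl_FrontMaterial.ambient;
		diffuse = gl_FrontMaterial.diffuse;
		}
	else
		{
		ambient = gl_BackMaterial.ambient;
		diffuse = gl_BackMaterial.diffuse;
		}

	/* Standard ambient + diffuse term for light 0 (directional): */
	vec3 lightDir = normalize(gl_LightSource[0].position.xyz);
	float nDotL = max(dot(normal, lightDir), 0.0);
	gl_FragColor = ambient * gl_LightSource[0].ambient
	             + diffuse * gl_LightSource[0].diffuse * nDotL;
	}
```

The key point is that the only extra work over single-sided lighting is the gl_FrontFacing test and conditional normal flip, which is why the measured penalty is close to 1x rather than 2x.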
But that’s surely an accident, right? Double-sided surfaces are extremely uncommon in the GeForce’s target applications, so Nvidia simply wouldn’t have spent any effort optimizing that code path, right? But lack of optimization would only explain a 2x penalty, which is what the laziest, most backwards implementation, i.e., rendering everything twice, would yield (I had actually considered implementing exactly that via a geometry shader, in case the more direct approach hadn’t worked). One wonders.
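That fallback approach, which I never ended up needing, would have looked roughly like this: a geometry shader that emits each triangle twice, once as-is and once with reversed winding and negated normals, so that plain single-sided lighting handles both sides. A sketch in GLSL 1.50, with illustrative names of my own:

```glsl
#version 150
layout(triangles) in;
layout(triangle_strip, max_vertices = 6) out;

in vec3 normalIn[];  // per-vertex eye-space normals from the vertex shader
out vec3 normalOut;

void main()
	{
	/* Pass the original triangle through unchanged: */
	for(int i = 0; i < 3; ++i)
		{
		gl_Position = gl_in[i].gl_Position;
		normalOut = normalIn[i];
		EmitVertex();
		}
	EndPrimitive();

	/* Emit a flipped copy: reverse the winding order and negate the normals: */
	for(int i = 2; i >= 0; --i)
		{
		gl_Position = gl_in[i].gl_Position;
		normalOut = -normalIn[i];
		EmitVertex();
		}
	EndPrimitive();
	}
```

Since this literally doubles the amount of geometry, it is the approach one would expect to cost about 2x, which makes the fixed-function 13x penalty all the more puzzling.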
Let’s, playing devil’s advocate, assume for a moment that double-sided surface performance is intentionally crippled. Then wouldn’t the shader-based equivalent be crippled as well? To paraphrase Cave Johnson, they’re not banging rocks together over there! Well, it turns out that the GLSL shading language is practically Turing-complete. Oh, how I love applying my background in theoretical computer science! Turing-completeness, roughly speaking, means that it is impossible for software to reliably detect the intent of a piece of code (by Rice’s theorem, every non-trivial semantic property of programs is undecidable). Meaning there is no way the GeForce driver can look at a shader program and say “Hey! Wait! It’s calculating double-sided illumination, let me slow it way down!” So there you go.
But anyway, why am I griping about this now? Because of the Quadro’s other selling point, frame-sequential stereo, which is obviously close to my heart. Legacy professional applications use quad-buffered stereo, if any stereo at all, so naturally that’s a feature not supported by GeForces (exactly how GeForces enable stereo for games, but disable it for “serious” applications, in Nvidia 3D Vision under Windows is yet another funny story).
But we’ve been using GeForces for years to generate stereo content for 3D TVs. Instead of frame-sequential stereo, 3D TVs use HDMI-standardized stereo modes, and those can be “faked” by a cleverly configured GeForce. Crucially, none of the HDMI stereo modes require the dedicated stereo synchronization connector that Quadros have and GeForces lack. I’ve recently been trying to convince manufacturers of desktop 3D monitors to support HDMI stereo as well, to reduce the total cost of such systems. And the typical reaction to my pitch is “well, we’re aiming for professional applications, and for those Quadros are 10x faster, so…”
And here’s the punchline: on a brand-new GeForce GTX 680, the fixed-function performance difference between single-sided and double-sided surfaces is 33x. Using a shader, it’s somewhere around 1.2x. Surely you can’t be serious?
Finally, to sprinkle some highly needed relevance onto this post: the upcoming Oculus Rift HMD will expect stereo frames in side-by-side format, meaning it will work with GeForces without any problems.