# A Trip Down the Graphics Pipeline

I recently received an Oculus Rift Development Kit Mk. II, and since I’m on Linux, there is no official SDK for me and I’m pretty much on my own. But that’s OK; it’s given me a chance to experiment with the DK2 as a black box, investigate ways I could support it in my VR toolkit under Linux, and improve Vrui’s user experience while I’m at it. I also managed to score a genuine Oculus VR Latency Tester, and did a set of experiments with interesting results. If you just want to see those results, skip to the end.

# The Woes of Windows

If you’ve been paying attention to the Oculus subreddit since the first DK2s were delivered to developers/enthusiasts, you’ll know the consensus is that the user experience of the DK2 and the SDK that drives it could be somewhat improved. Granted, it’s a developer’s kit and not a consumer product, but even developers seem to be spending more time getting the DK2 to run smoothly, or run at all, than actually developing for it (or at least that’s the impression I get from the communal bellyaching).

# On the road for VR: Silicon Valley Virtual Reality Conference & Expo

I just got back from the Silicon Valley Virtual Reality Conference & Expo in the awesome Computer History Museum in Mountain View, just across the street from Google HQ. There were talks, there were round tables, there were panels (I was on a panel on non-game applications enabled by consumer VR, livestream archive here), but most importantly, there was an expo for consumer VR hardware and software. Without further ado, here are my early reports on what I saw and/or tried.

Figure 1: Main auditorium during the “60 second” lightning pitches.

# 3D Video Capture with Three Kinects

I just moved all my Kinects back to my lab after my foray into experimental mixed-reality theater a week ago, and just rebuilt my 3D video capture space / tele-presence site consisting of an Oculus Rift head-mounted display and three Kinects. Now that I have a new extrinsic calibration procedure to align multiple Kinects to each other (more on that soon), and managed to finally get a really nice alignment, I figured it was time to record a short video showing what multi-camera 3D video looks like using current-generation technology (no, I don’t have any Kinects Mark II yet). See Figure 1 for a still from the video, and the whole thing after the jump.

Figure 1: A still frame from the video, showing the user’s real-time “holographic” avatar from the outside, providing a literal kind of out-of-body experience to the user.

# Quikwriting with a Thumbstick

In my previous post about gaze-directed Quikwriting I mentioned that the method should be well-suited to be mapped to a thumbstick. And indeed it is:

Using Vrui, implementing this was a piece of cake. Instead of modifying the existing Quikwrite tool, I created a new transformation tool that converts a two-axis analog joystick, e.g., a thumbstick on a game controller, to a virtual 6-DOF input device moving inside a flat square. Then, when binding the unmodified Quikwrite tool to that virtual input device, exactly what you’d expect happens: the directions of the thumbstick translate 1:1 to the character selection regions of the Quikwrite square. I’m expecting that this new transformation tool will come in handy for other applications in the future, so that’s another benefit.

# Gaze-directed Text Entry in VR Using Quikwrite

Text entry in virtual environments is one of those old problems that never seem to get solved. The core issue, of course, is that users in VR either don’t have keyboards (because they are in a CAVE, say), or can’t effectively use the keyboard they do have (because they are wearing an HMD that obstructs their vision). To the latter point: I consider myself a decent touch typist (my main keyboard doesn’t even have key labels), but the moment I put on an HMD, that goes out the window. There’s an interesting research question right there — do typists need to see their keyboards in their peripheral vision to use them, even when they never look at them directly? — but that’s a topic for another post.

Until speech recognition becomes powerful and reliable enough to use as an exclusive method (and even then, imagining having to dictate “for(int i=0;i<numEntries&&entries[i].key!=searchKey;++i)” already gives me a headache), and until brain/computer interfaces are developed and we plug our computers directly into our heads, we’re stuck with other approaches.

Unsurprisingly, the go-to method for developers who don’t want to write a research paper on text entry, but just need text entry in their VR applications right now, and don’t have good middleware to back them up, is a virtual 3D QWERTY keyboard controlled by a 2D or 3D input device (see Figure 1). It’s familiar, straightforward to implement, and it can even be used to enter text.

Figure 1: Guilty as charged — a virtual keyboard in the Vrui toolkit, implemented as a GLMotif pop-up window with rows and columns of buttons.

# Someone at Oculus is Reading my Blog

I am getting the feeling that Big Brother is watching me. When I released the initial version of the Vrui VR toolkit with native Oculus Rift support, it had magnetic yaw drift correction, which the official Oculus SDK didn’t have at that point (Vrui doesn’t use the Oculus SDK at all to talk to the Rift; it has its own tracking driver that talks to the Rift’s inertial measurement unit directly via USB, and does its own sensor fusion, and also does its own projection setup and lens distortion correction). A week or so later, Oculus released an updated SDK with magnetic drift correction.

A little more than a month ago, I wrote a pair of articles investigating and explaining the internals of the Rift’s display, and how small deviations in calibration have a large effect on the perceived size of the virtual world, and the degree of “solidity” (for lack of a better word) of the virtual objects therein. In those posts, I pointed out that a single lens distortion correction formula doesn’t suffice, because lens distortion parameters depend on the position of the viewers’ eyes relative to the lenses, particularly the eye/lens distance, otherwise known as “eye relief.” And guess what: I just got an email via the Oculus developer mailing list announcing the (preview) release of SDK version 0.3.1, which lists eye relief-dependent lens correction as one of its major features.

Maybe I should keep writing articles on the virtues of 3D pupil tracking, and the obvious benefits of adding an inertially/optically tracked 6-DOF input device to the consumer-level Rift’s basic package, and those things will happen as well. 🙂

# More on Desktop Embedding via VNC

I started regretting uploading my “Embedding 2D Desktops into VR” video, and the post describing it, pretty much right after I did it, because there was such an obvious thing to do, and I didn’t think of it.

Figure 1: Screenshot from video showing VR ProtoShop run simultaneously in a 3D environment created by an Oculus Rift and a Razer Hydra, and in a 2D environment using mouse and keyboard, brought into the 3D environment via the VNC remote desktop protocol.

# 2D Desktop Embedding via VNC

There have been several discussions on the Oculus subreddit recently about how to integrate 2D desktops or 2D applications with 3D VR environments; for example, how to check your Facebook status while playing a game in the Oculus Rift without having to take off the headset.

This is just one aspect of the larger issue of integrating 2D and 3D applications, and it reminded me that it was about time to revive the old VR VNC client that Ed Puckett, an external contractor, had developed for the CAVE a long time ago. There have been several important changes in Vrui since the VNC client was written, especially in how Vrui handles text input, which means that a completely rewritten client could use the new Vrui APIs instead of having to implement everything ad-hoc.

Here is a video showing the new VNC client in action, embedded into LiDAR Viewer and displayed in a desktop VR environment using an Oculus Rift HMD, mouse and keyboard, and a Razer Hydra 6-DOF input device:

# Small Correction to Rift’s Projection Matrix

In a previous post, I looked at the Oculus Rift’s internal projection in detail, and did some analysis of how stereo rendering setup is explained in the Rift SDK’s documentation. Looking at that again, I noticed something strange.

In the other post, I simplified the Rift’s projection matrix as presented in the SDK documentation to

$P = \begin{pmatrix} \frac{2 \cdot \mathrm{EyeToScreenDistance}}{\mathrm{HScreenSize} / 2} & 0 & 0 & 0 \\ 0 & \frac{2 \cdot \mathrm{EyeToScreenDistance}}{\mathrm{VScreenSize}} & 0 & 0 \\ 0 & 0 & \frac{z_\mathrm{far}}{z_\mathrm{near} - z_\mathrm{far}} & \frac{z_\mathrm{far} \cdot z_\mathrm{near}}{z_\mathrm{near} - z_\mathrm{far}} \\ 0 & 0 & -1 & 0 \end{pmatrix}$

which, to those in the know, doesn’t look like a regular OpenGL projection matrix, such as created by glFrustum(…). More precisely, the third row of P is off. The third-column entry should be $\frac{z_\mathrm{near} + z_\mathrm{far}}{z_\mathrm{near} - z_\mathrm{far}}$ instead of $\frac{z_\mathrm{far}}{z_\mathrm{near} - z_\mathrm{far}}$, and the fourth-column entry should be $2 \cdot \frac{z_\mathrm{far} \cdot z_\mathrm{near}}{z_\mathrm{near} - z_\mathrm{far}}$ instead of $\frac{z_\mathrm{far} \cdot z_\mathrm{near}}{z_\mathrm{near} - z_\mathrm{far}}$. To clarify, I didn’t make a mistake in the derivation; the matrix’s third row is the same in the SDK documentation.

What’s the difference? It’s subtle. Changing the third row of the projection matrix doesn’t change where pixels end up on the screen (that’s the good news). It only changes the z, or depth, value assigned to those pixels. In a standard OpenGL frustum matrix, 3D points on the near plane get a depth value of -1.0, and those on the far plane get a depth value of 1.0. The 3D clipping operation that’s applied to any triangle after projection uses those depth values to cut off geometry outside the view frustum, and the viewport projection after that will map the [-1.0, 1.0] depth range to [0, 1] for z-buffer hidden surface removal.

Using a projection matrix as presented in the previous post, or in the SDK documentation, will still assign a depth value of 1.0 to points on the far plane, but a depth value of 0.0 to points on the (nominal) near plane. This means that the near plane distance given as a parameter to the matrix is not the actual near plane distance used by clipping and z buffering, which might lead to some geometry appearing in the view that shouldn’t, and to a loss of resolution in the z buffer because only half the value range is used.
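For those who want to check the depth values by hand, here is a minimal Python sketch. It assumes the usual OpenGL conventions (the viewer looks down the negative z axis, and normalized depth runs from -1.0 at the near plane to 1.0 at the far plane). The function names are made up for illustration and are not part of any SDK.

```python
def ndc_depth(a, b, z_eye):
    """Normalized depth of an eye-space point at z_eye (negative in front of the viewer).

    a and b are the third- and fourth-column entries of the matrix's third row.
    The fourth row is (0, 0, -1, 0), so the clip-space w component is -z_eye.
    """
    z_clip = a * z_eye + b
    w_clip = -z_eye
    return z_clip / w_clip


def opengl_row(z_near, z_far):
    """Third-row entries of a standard glFrustum-style matrix."""
    d = z_near - z_far
    return (z_near + z_far) / d, 2.0 * z_far * z_near / d


def sdk_doc_row(z_near, z_far):
    """Third-row entries as printed in the SDK documentation."""
    d = z_near - z_far
    return z_far / d, z_far * z_near / d


def effective_near(z_near, z_far):
    """Distance at which the SDK-documentation row reaches depth -1.0,
    i.e., where clipping actually happens. Found by solving ndc_depth == -1."""
    return z_far * z_near / (2.0 * z_far - z_near)
```

With z_near = 0.1 and z_far = 100.0, `effective_near` comes out at about 0.05, so the real near plane sits at roughly half the distance that was asked for.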

I’m assuming that this is just a typo in the Oculus SDK documentation, and that the library code does the right thing (I haven’t looked).

Oh, right, so the fixed projection matrix, for those working along, is

$P = \begin{pmatrix} \frac{2 \cdot \mathrm{EyeToScreenDistance}}{\mathrm{HScreenSize} / 2} & 0 & 0 & 0 \\ 0 & \frac{2 \cdot \mathrm{EyeToScreenDistance}}{\mathrm{VScreenSize}} & 0 & 0 \\ 0 & 0 & \frac{z_\mathrm{near} + z_\mathrm{far}}{z_\mathrm{near} - z_\mathrm{far}} & 2 \cdot \frac{z_\mathrm{far} \cdot z_\mathrm{near}}{z_\mathrm{near} - z_\mathrm{far}} \\ 0 & 0 & -1 & 0 \end{pmatrix}$
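As a sanity check, the sketch below builds both variants of the simplified matrix in plain Python and projects the same point through each. It is only an illustration under stated assumptions: `rift_projection` and `project` are hypothetical helper names, the lens-center offset is left out just as in the simplified matrix, and the screen parameters in the usage example are arbitrary numbers, not real Rift values.

```python
def rift_projection(eye_to_screen, h_screen_size, v_screen_size,
                    z_near, z_far, fixed=True):
    """Simplified projection matrix as a list of rows.

    fixed=True uses the glFrustum-style third row;
    fixed=False uses the third row as printed in the SDK documentation.
    """
    d = z_near - z_far
    if fixed:
        a, b = (z_near + z_far) / d, 2.0 * z_far * z_near / d
    else:
        a, b = z_far / d, z_far * z_near / d
    return [
        [2.0 * eye_to_screen / (h_screen_size / 2.0), 0.0, 0.0, 0.0],
        [0.0, 2.0 * eye_to_screen / v_screen_size, 0.0, 0.0],
        [0.0, 0.0, a, b],
        [0.0, 0.0, -1.0, 0.0],
    ]


def project(m, point):
    """Multiply m by (x, y, z, 1) and apply the perspective divide."""
    v = (point[0], point[1], point[2], 1.0)
    clip = [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]
    return tuple(clip[i] / clip[3] for i in range(3))


# Arbitrary illustrative parameters (meters), not measured Rift values.
params = dict(eye_to_screen=0.04, h_screen_size=0.15, v_screen_size=0.09,
              z_near=0.1, z_far=100.0)
p_fixed = project(rift_projection(fixed=True, **params), (0.1, -0.2, -2.0))
p_doc = project(rift_projection(fixed=False, **params), (0.1, -0.2, -2.0))
```

The x and y components of `p_fixed` and `p_doc` are identical, and only the depth component differs, which matches the observation that the third row does not move pixels on the screen.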