# 2D Desktop Embedding via VNC

There have been several discussions on the Oculus subreddit recently about how to integrate 2D desktops or 2D applications with 3D VR environments; for example, how to check your Facebook status while playing a game in the Oculus Rift without having to take off the headset.

This is just one aspect of the larger issue of integrating 2D and 3D applications, and it reminded me that it was about time to revive the old VR VNC client that Ed Puckett, an external contractor, had developed for the CAVE a long time ago. There have been several important changes in Vrui since the VNC client was written, especially in how Vrui handles text input, which means that a completely rewritten client could use the new Vrui APIs instead of having to implement everything ad-hoc.

Here is a video showing the new VNC client in action, embedded into LiDAR Viewer and displayed in a desktop VR environment using an Oculus Rift HMD, mouse and keyboard, and a Razer Hydra 6-DOF input device:

# Small Correction to Rift’s Projection Matrix

In a previous post, I looked at the Oculus Rift’s internal projection in detail, and did some analysis of how stereo rendering setup is explained in the Rift SDK’s documentation. Looking at that again, I noticed something strange.

In the other post, I simplified the Rift’s projection matrix as presented in the SDK documentation to

$P = \begin{pmatrix} \frac{2 \cdot \mathrm{EyeToScreenDistance}}{\mathrm{HScreenSize} / 2} & 0 & 0 & 0 \\ 0 & \frac{2 \cdot \mathrm{EyeToScreenDistance}}{\mathrm{VScreenSize}} & 0 & 0 \\ 0 & 0 & \frac{z_\mathrm{far}}{z_\mathrm{near} - z_\mathrm{far}} & \frac{z_\mathrm{far} \cdot z_\mathrm{near}}{z_\mathrm{near} - z_\mathrm{far}} \\ 0 & 0 & -1 & 0 \end{pmatrix}$

which, to those in the know, doesn’t look like a regular OpenGL projection matrix, such as created by glFrustum(…). More precisely, the third row of P is off. The third-column entry should be $\frac{z_\mathrm{near} + z_\mathrm{far}}{z_\mathrm{near} - z_\mathrm{far}}$ instead of $\frac{z_\mathrm{far}}{z_\mathrm{near} - z_\mathrm{far}}$, and the fourth-column entry should be $2 \cdot \frac{z_\mathrm{far} \cdot z_\mathrm{near}}{z_\mathrm{near} - z_\mathrm{far}}$ instead of $\frac{z_\mathrm{far} \cdot z_\mathrm{near}}{z_\mathrm{near} - z_\mathrm{far}}$. To clarify, I didn’t make a mistake in the derivation; the matrix’s third row is the same in the SDK documentation.

What’s the difference? It’s subtle. Changing the third row of the projection matrix doesn’t change where pixels end up on the screen (that’s the good news). It only changes the z, or depth, value assigned to those pixels. In a standard OpenGL frustum matrix, 3D points on the near plane get a depth value of -1.0, and those on the far plane get a depth value of 1.0. The 3D clipping operation that’s applied to any triangle after projection uses those depth values to cut off geometry outside the view frustum, and the viewport projection after that maps the [-1.0, 1.0] depth range to [0, 1] for z-buffer hidden surface removal.

Using a projection matrix as presented in the previous post, or in the SDK documentation, will still assign a depth value of 1.0 to points on the far plane, but a depth value of 0.0 to points on the (nominal) near plane. This means that the near plane distance given as a parameter to the matrix is not the actual near plane distance used by clipping and z-buffering, which might let some geometry appear in the view that shouldn’t, and wastes z-buffer resolution because only half the value range is used.
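
To put a number on that, one can solve for the eye-space distance that this matrix maps to the clipping depth value of -1.0 (a quick derivation of my own, not something from the SDK documentation):

$\frac{z \cdot \frac{z_\mathrm{far}}{z_\mathrm{near} - z_\mathrm{far}} + \frac{z_\mathrm{far} \cdot z_\mathrm{near}}{z_\mathrm{near} - z_\mathrm{far}}}{-z} = -1 \quad \Rightarrow \quad z = -\frac{z_\mathrm{far} \cdot z_\mathrm{near}}{2 \cdot z_\mathrm{far} - z_\mathrm{near}}$

In other words, the actual near clipping distance is $\frac{z_\mathrm{far} \cdot z_\mathrm{near}}{2 \cdot z_\mathrm{far} - z_\mathrm{near}}$, which approaches half the nominal $z_\mathrm{near}$ when $z_\mathrm{far} \gg z_\mathrm{near}$; for example, $z_\mathrm{near} = 0.1$ and $z_\mathrm{far} = 1000$ yield an actual near plane at roughly $0.05$.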

I’m assuming that this is just a typo in the Oculus SDK documentation, and that the library code does the right thing (I haven’t looked).

Oh, right, so the fixed projection matrix, for those following along, is

$P = \begin{pmatrix} \frac{2 \cdot \mathrm{EyeToScreenDistance}}{\mathrm{HScreenSize} / 2} & 0 & 0 & 0 \\ 0 & \frac{2 \cdot \mathrm{EyeToScreenDistance}}{\mathrm{VScreenSize}} & 0 & 0 \\ 0 & 0 & \frac{z_\mathrm{near} + z_\mathrm{far}}{z_\mathrm{near} - z_\mathrm{far}} & 2 \cdot \frac{z_\mathrm{far} \cdot z_\mathrm{near}}{z_\mathrm{near} - z_\mathrm{far}} \\ 0 & 0 & -1 & 0 \end{pmatrix}$
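
For those who want to plug this in directly, here is a minimal sketch of building the corrected matrix in code. It is my own illustration, not code from the Oculus SDK or Vrui; the entries are laid out in OpenGL’s column-major order so the array can be handed to glLoadMatrixf().

```cpp
#include <array>

// Corrected per-eye projection matrix, OpenGL column-major layout (m[col*4 + row]).
// Parameter names mirror the SDK symbols used in the matrix above.
std::array<float, 16> riftProjectionMatrix(float eyeToScreenDistance,
                                           float hScreenSize, float vScreenSize,
                                           float zNear, float zFar)
{
    std::array<float, 16> m = {}; // all other entries stay 0

    m[0]  = 2.0f * eyeToScreenDistance / (hScreenSize * 0.5f); // x scale (each eye sees half the screen)
    m[5]  = 2.0f * eyeToScreenDistance / vScreenSize;          // y scale
    m[10] = (zNear + zFar) / (zNear - zFar);                   // corrected third-row z entry
    m[14] = 2.0f * zFar * zNear / (zNear - zFar);              // corrected third-row w entry
    m[11] = -1.0f;                                             // fourth row: -z becomes the clip-space w
    return m;
}
```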

# Game Engines and Positional Head Tracking

Oculus recently presented the “Crystal Cove,” a version of the Rift head-mounted display with built-in optical tracking, which is combined with the existing inertial tracker to provide a full 6-DOF (position and orientation) tracking solution at low latency. It is rumored that the Crystal Cove will be released as development kit mark 2 after this year’s Game Developers Conference.

This is great news. I’ve been saying for a long time that Oculus cannot afford to drop positional head tracking on developers at the last minute, because it will break several assumptions built into game engines and other VR software (but let’s talk about game engines here). I’m also happy because the Crystal Cove uses precisely the tracking technology that I predicted: active markers (LEDs) on the headset, and an external camera placed at a fixed position in the environment. I am also sad because I didn’t manage to finish my own after-market optical tracking add-on before Oculus demonstrated their new integrated technology, but that’s life.

So why does positional head tracking break existing games? Because for the first time, the virtual camera used to render a game world is no longer fully under the software’s control. Let’s take a step back. In a standard, desktop, 3D game, the camera is entirely controlled by the software. The software sets it to some position and orientation determined by the game logic, the 3D engine renders the virtual world for that camera setup, and the result is the displayed image.
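
To make the contrast concrete, here is a minimal sketch of the view setup in both cases, using a generic rigid-transform type of my own rather than any particular engine’s API: on the desktop the view transform is derived entirely from game state, while with a positionally tracked HMD the per-frame head pose is composed in from outside the game logic.

```cpp
// Minimal rigid-body transform (rotation matrix + translation); a stand-in for
// whatever pose type a real engine uses, just enough to show the composition.
struct Transform {
    float r[3][3]; // orthonormal rotation
    float t[3];    // translation

    // Composition: (a * b) applies b first, then a.
    Transform operator*(const Transform& o) const {
        Transform c = {};
        for (int i = 0; i < 3; ++i) {
            for (int j = 0; j < 3; ++j)
                for (int k = 0; k < 3; ++k) c.r[i][j] += r[i][k] * o.r[k][j];
            c.t[i] = t[i];
            for (int k = 0; k < 3; ++k) c.t[i] += r[i][k] * o.t[k];
        }
        return c;
    }
    // Inverse of a rigid transform: R^T, -R^T * t.
    Transform inverse() const {
        Transform inv = {};
        for (int i = 0; i < 3; ++i)
            for (int j = 0; j < 3; ++j) inv.r[i][j] = r[j][i];
        for (int i = 0; i < 3; ++i)
            for (int k = 0; k < 3; ++k) inv.t[i] -= r[k][i] * t[k];
        return inv;
    }
};

// Desktop game: the world-to-eye transform comes entirely from game state.
Transform desktopView(const Transform& gameCameraPose) {
    return gameCameraPose.inverse();
}

// Positionally tracked HMD: the per-frame tracked head pose (reported in the
// tracking system's physical "play space") is composed with a game-controlled
// play-space anchor. The game still moves the anchor, but it no longer dictates
// where the viewer's eyes actually are.
Transform hmdView(const Transform& playSpaceAnchor, const Transform& trackedHeadPose) {
    return (playSpaceAnchor * trackedHeadPose).inverse();
}
```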

# A Follow-up on Eye Tracking

Now this is why I run a blog. In my video and post on the Oculus Rift’s internals, I talked about distortions in 3D perception when the programmed-in camera positions for the left and right stereo views don’t match the current left and right pupil positions, and how a “perfect” HMD would therefore need a built-in eye tracker. That’s still correct, but it turns out that I could have done a much better job approximating proper 3D rendering when there is no eye tracking.

This improvement was pointed out by a commenter on the previous post. TiagoTiago asked if it wouldn’t be better if the virtual camera were located at the centers of the viewer’s eyeballs instead of at the pupils, because then light rays entering the eye straight on would be represented correctly, independently of eye vergence angle. Spoiler alert: he was right. But I was skeptical at first, because, after all, that’s just plain wrong. All light rays entering the eye converge at the pupil, and therefore that’s the only correct position for the virtual camera.

Well, that’s true, but if the current pupil position is unknown due to lack of eye tracking, then the only correct thing turns into just another approximation, and who’s to say which approximation is better. My hunch was that the distortion effects from having the camera in the center of the eyeballs would be worse, but given that projection in an HMD involving a lens is counter-intuitive, I still had to test it. Fortunately, adding an interactive foveating mechanism to my lens simulation application was simple.
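
As a concrete illustration of why this is an approximation problem in the first place (my own sketch, with an assumed anatomical offset, not anything from the simulation application): the pupil sits roughly a centimeter in front of the eye’s center of rotation, so its position in head coordinates moves with the gaze direction, and without an eye tracker that direction is simply unknown.

```cpp
struct Vec3 { float x, y, z; };

// Rough illustration: the pupil lies at a fixed offset in front of the eye's
// center of rotation, so its position in head coordinates depends on the current
// gaze direction, which is unknown without eye tracking. The 10 mm offset is an
// assumed, approximate anatomical value, not a measured one.
Vec3 pupilPosition(const Vec3& eyeRotationCenter, const Vec3& gazeDir /* unit length */)
{
    const float pupilOffset = 0.010f; // metres from rotation center to pupil (assumed)
    return Vec3{ eyeRotationCenter.x + gazeDir.x * pupilOffset,
                 eyeRotationCenter.y + gazeDir.y * pupilOffset,
                 eyeRotationCenter.z + gazeDir.z * pupilOffset };
}
```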

Turns out that I was wrong, and that in the presence of a collimating lens, i.e., a lens that is positioned such that the HMD display screen is in the lens’ focal plane, distortion from placing the camera in the center of the eyeball is significantly less pronounced than in my approach. Just don’t ask me to explain it for now — it’s due to the “special properties of the collimated light.” 🙂

# Elon Musk discovers AR/VR

Serial entrepreneur Elon Musk posted this double whammy of cryptic messages to his Twitter account on August 23rd:

@elonmusk: We figured out how to design rocket parts just w hand movements through the air (seriously). Now need a high frame rate holograph generator.

@elonmusk: Will post video next week of designing a rocket part with hand gestures & then immediately printing it in titanium

As there are no further details, and the video is now slightly delayed (per Twitter as of September 2nd: @elonmusk: Video was done last week, but needs more work. Aiming to publish link in 3 to 4 days.), it’s time to speculate! I was hoping to have seen the video by now, but oh well. Deadline is deadline.

First of all: what’s he talking about? My best guess is a free-hand, direct-manipulation, 6-DOF user interface for a 3D computer-aided design (CAD) program. In other words, something roughly like this (just take away the hand-held devices and substitute NURBS surfaces and rocket parts for atoms and molecules, but leave the interaction method and everything else the same):

# Vrui on (in?) Oculus Rift

I wrote about my first impressions of the Oculus Rift developer kit back in April, and since then I’ve been working (on and off) on getting it fully and natively supported in Vrui (see Figure 1 for proof that it works). Given that Vrui’s somewhat insane flexibility is a major point of pride for me, what was it that I actually had to create to support the Rift? Turns out, not all that much: a driver for the Rift’s built-in inertial tracking unit and a post-processing filter to correct for the Rift’s lens distortion were all it took (more on that later). So why did it take me this long? For one, I was mostly working on other things and only spent a few hours here and there, but more importantly, the Rift is not just a new head-mounted display (HMD), but a major shift in how HMDs are (or will be) used.
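
To give an idea of what that lens-distortion post-processing filter does, here is a sketch of my own using the commonly used radial polynomial model; the coefficients are placeholders, not Vrui’s or the Rift’s actual calibration values. Every output pixel samples the rendered image at a radially pushed-out position, so the shader’s barrel distortion and the lens’s pincushion distortion cancel.

```cpp
// One step of a barrel-distortion post-processing filter: map an output pixel's
// texture coordinate (given relative to the lens center) to the position in the
// undistorted rendered image it should sample from. Radial polynomial model;
// k1..k3 are illustrative placeholder coefficients, not actual calibration values.
void distortedSamplePosition(float x, float y, float& srcX, float& srcY)
{
    const float k1 = 0.22f, k2 = 0.24f, k3 = 0.0f; // assumed example values
    float r2 = x * x + y * y;                      // squared distance from lens center
    float scale = 1.0f + r2 * (k1 + r2 * (k2 + r2 * k3));
    srcX = x * scale;
    srcY = y * scale;
}
```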

Figure 1: The trademark “double-barrel” Oculus Rift screenshot, this time generated by a Vrui application.

# Behind the scenes: “Virtual Worlds Using Head-mounted Displays”

“Virtual Worlds Using Head-mounted Displays” is the most complex video I’ve made so far, and I figured I should explain how it was done (maybe as a response to people who might say I “cheated”).

# The reality of head-mounted displays

So it appears the Oculus Rift is really happening. A buddy of mine went in early on the Kickstarter, and his will supposedly be in the mail some time this week. In a way the Oculus Rift, or, more precisely, the most recent foray of VR into the mainstream that it embodies, was the reason why I started this blog in the first place. I’m very much looking forward to it (more on that below), but I’m also somewhat worried that the huge level of pre-release excitement in the gaming world might turn into a backlash against VR in general. So I made a video laying out my opinions (see Figure 1, or the embedded video below).

Figure 1: Still from a video describing how head-mounted displays should be used to create convincing virtual worlds.

# How Milo met CAVE

I just read an interesting article, a behind-the-scenes look at the infamous “Milo” demo Peter Molyneux did at 2009’s E3 to introduce Project Natal, i.e., Kinect.

This article is related to VR in two ways. First, the usual progression of overhyping the capabilities of some new technology and then falling flat on one’s face because not even one’s own developers know what the new technology’s capabilities actually are is something that should be very familiar to anyone working in the VR field.

But here’s the quote that really got my interest (emphasis is mine):

Others recall worrying about the presentation not being live, and thinking people might assume it was fake. Milo worked well, they say, but filming someone playing produced an optical illusion where it looked like Milo was staring at the audience rather than the player. So for the presentation, the team hired an actress to record a version of the sequence that would look normal on camera, then had her pretend to play along with the recording. … “We brought [Claire] in fairly late, probably in the last two or three weeks before E3, because we couldn’t get it to [look right],” says a Milo team member. “And we said, ‘We can’t do this. We’re gonna have to make a video.’ So she acted to a video. Was that obvious to you?” Following Molyneux’s presentation, fans picked apart the video, noting that it looked fake in certain places.

Gee, sound familiar? This is, of course, the exact problem posed by filming a holographic display and a person inside it interacting with it. In a holographic display, the images on the screens are generated for the precise point of view of the person using it, not the camera. This means it looks wrong when filmed straight up. If, on the other hand, it’s filmed so it looks right on camera, then the person inside will have a very hard time using it properly. Catch-22.
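
To make that concrete, here is a minimal sketch of how a head-tracked display sets up its projection (my own illustration, not Vrui’s actual code): the frustum is derived from the tracked viewer’s eye position relative to the fixed physical screen, so the rendered images are only geometrically correct from that one viewpoint, and a camera anywhere else sees a skewed picture.

```cpp
// Asymmetric ("off-axis") view frustum for a head-tracked screen. Coordinates:
// the physical screen spans [0,width] x [0,height] in the z=0 plane, and the
// tracked eye position is given in the same coordinate system with eyeZ > 0.
// Illustrative sketch only, not Vrui's actual projection code.
struct Frustum { float left, right, bottom, top, zNear, zFar; };

Frustum headTrackedFrustum(float width, float height,
                           float eyeX, float eyeY, float eyeZ,
                           float zNear, float zFar)
{
    float s = zNear / eyeZ; // scale screen edges from the screen plane to the near plane
    Frustum f;
    f.left   = (0.0f   - eyeX) * s;
    f.right  = (width  - eyeX) * s;
    f.bottom = (0.0f   - eyeY) * s;
    f.top    = (height - eyeY) * s;
    f.zNear  = zNear;
    f.zFar   = zFar;
    // Feed these into a glFrustum()-style matrix, then translate the world by
    // (-eyeX, -eyeY, -eyeZ); repeat every frame as the tracked head moves.
    return f;
}
```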

With the “Milo” demo, the problem was similar. Because the game was set up to interact with whoever was watching it, it ended up interacting with the camera, so to speak, instead of with the player. Now, if the Milo software had been set up with the level of flexibility of proper VR software, it would have been an easy fix to adapt the character’s gaze direction etc. to a filming setting. But since game software in the past never had to deal with this kind of non-rigid environment, it typically ends up fully vertically integrated, and making this tiny change would probably have taken months of work (that’s kind of what I meant when I said “not even one’s own developers know what the new technology’s capabilities actually are” above). Am I saying that Milo failed because of the demo video? No. But I don’t think it helped, either.

The take-home message here is that mainstream games are slowly converging towards approaches that have been embodied in proper VR software for a long time now, without really noticing it, and are repeating old mistakes. The Oculus Rift will really bring that out front and center. And I am really hoping it won’t fall flat on its face simply because software developers didn’t do their homework.

# Virtual clay modeling with 3D input devices

It’s funny, suddenly the idea of virtual sculpting or virtual clay modeling using 3D input devices is popping up everywhere. The developers behind the Leap Motion stated it as their inspiration to develop the device in the first place, and I recently saw a demo video; Sony has recently been showing it off as a demo for the upcoming Playstation 4; and I’ve just returned from an event at the Sacramento Hacker Lab, where Intel was trying to get developers excited about their version of the Kinect, or what they call “perceptual computing.” One of the demos they showed was — you guessed it — virtual sculpting (one other demo was 3D video of a person embedded into a virtual office; now where have I seen that before?).

So I decided a few days ago to dust off an old toy application (I showed it last in my 2007 Wiimote hacking video), a volumetric virtual “clay” modeler with real-time isosurface extraction for visualization, and run it with a Razer Hydra controller, which supports bi-manual 6-DOF interaction, a pretty ideal setup for this sort of thing: