How do all the technical components of VR headsets, e.g., screens, lenses, tracking, etc., actually come together to create realistic-looking virtual environments? Specifically, why do virtual environment in VR look more “real” compared to when viewed via other media, for example panoramic video?
The reason I’m bringing this up again is that the question keeps getting asked, and that it’s really kinda hard to answer. Most attempts to answer it fall back on technical aspects, such as stereoscopy, head tracking, etc., but I find that this approach somewhat misses the point by focusing on individual components, or at least gets mired in technical details that don’t make much sense to those who have to ask the question in the first place.
I prefer to approach the question from the opposite end: not through what VR hardware produces, but instead through how the viewer perceives 3D objects and/or environments, and how either the real world on the one hand, or virtual reality displays on the other, create the appropriate visual input to support that perception.
The downside with that approach is that it doesn’t lend itself to short answers. In fact, last summer, I gave a 25 minute talk about this exact topic at the 2016 VRLA Summer Expo. It may not be news, but I haven’t linked this video from here before, and it’s probably still timely:
Yesterday, I attended the second annual Silicon Valley Virtual Reality Conference & Expo in San Jose’s convention center. This year’s event was more than three times bigger than last year’s, with around 1,400 attendees and a large number of exhibitors.
Unfortunately, I did not have as much time as I would have liked to visit and try all the exhibits. There was a printing problem at the registration desk in the morning, and as a result the keynote and first panel were pushed back by 45 minutes, overlapping the expo time; additionally, I had to spend some time preparing for and participating in my own panel on “VR Input” from 3pm-4pm.
The panel was great: we had Richard Marks from Sony (Playstation Move, Project Morpheus), Danny Woodall from Sixense (STEM), Yasser Malaika from Valve (HTC Vive, Lighthouse), Tristan Dai from Noitom (Perception Neuron), and Jason Jerald as moderator. There was lively discussion of questions posed by Jason and the audience. Here’s a recording of the entire panel:
One correction: when I said I had been following Tactical Haptics‘ progress for 2.5 years, I meant to say 1.5 years, since the first SVVR meet-up I attended. Brainfart. Continue reading →
I have briefly mentioned HoloLens, Microsoft’s upcoming see-through Augmented Reality headset, in a previous post, but today I got the chance to try it for myself at Microsoft’s “Build 2015” developers’ conference. Before we get into the nitty-gritty, a disclosure: Microsoft invited me to attend Build 2015, meaning they waived my registration fee, and they gave me, like all other attendees, a free HP Spectre x360 notebook (from which I’m typing right now because my vintage 2008 MacBook Pro finally kicked the bucket). On the downside, I had to take Amtrak and Bart to downtown San Francisco twice, because I wasn’t able to get a one-on-one demo slot on the first day, and got today’s 10am slot after some finagling and calling in of favors. I guess that makes us even. 😛
I’ve been getting a lot of questions about using the Rift DK2 under Linux with Vrui recently, so I figured I’d post a little progress report here instead of answering them individually.
The good news is that I have the DK2 working to the level of the DK1, i.e., I have orientational tracking, lens distortion correction, and chromatic aberration correction. I also have low persistence, but that came for free.
What I don’t have, and most probably won’t have until an official Linux SDK drops, is positional tracking. In order to replicate the work a team of computer vision experts at Oculus have been doing for the last year or so, I’d need a few clones and a time machine. That said, I am working on combining the DK1/DK2’s built-in IMU with other external tracking systems, such as Intersense IS-900 or NaturalPoint OptiTrack. That’s a much easier (but still tricky) problem, and would allow using the Rift as a headset for large-area VR. Probably not interesting for home users, but being able to walk around freely in an 18’x10’x7′ volume opens up entirely different VR applications.
I’m currently working hard on the next release of the Vrui toolkit (version 3.2-001), which will have at least the level of DK2 support that I have internally now (combined tracking might or might not make it, but that can already be faked, see 3D Video Capture With Three Kinects).
The reason why I’m not releasing right now is that I’m still trying to optimize the “user experience” by integrating the ideas I described in A Trip Down the Graphics Pipeline. The idea is that plugging in a Rift and starting a Vrui application should just work. I have most of that going; the only issue is telling OpenGL to sync to the vertical retrace on the Rift’s display, no matter what. Right now that can only be done via environment variable, and I’m looking for the right place in Vrui to set that variable from inside a program. It’s a work-around until Nvidia expose that functionality via their NV-CONTROL X extension, or, even better, via a GLX extension (are you listening, Nvidia?). Or, why not change the implementation of GLX_SGI_video_sync, which is already bound to a display and drawable, such that it always syncs to the first video controller servicing that drawable? Wouldn’t even require a specification change. Just an idea.
And last but not least, once I got the DK2 and its low-persistence screen working, I realized how cavalier I’ve been about low-level timing issues in Vrui. With screen-based VR and LCD-based HMDs it has simply never been an issue before, but now it’s pretty obvious. Good thing is, I think I have a handle on it.
In summary: it’ll be a little bit longer, but I’m on it. Will I be able to release before Oculus does their Linux SDK? Sure hope so! And just in case you think I’ve been sitting on my hands for the last six months: there are already about 300 large and small changes between 3.1-002 and 3.2-001.
And here is today’s unrelated picture:
Figure 1: New adventures in real estate speculation.
After some initial uncertainty, and accidentally raising a stink on reddit, I did manage to attend Oculus Connect last weekend after all. I guess this is what a birthday bash looks like when the feted is backed by Facebook and gets to invite 1200 of his closest friends… and yours truly! It was nice to run into old acquaintances, meet new VR geeks, and it is still an extremely weird feeling to be approached by people who introduce themselves as “fans.” There were talks and panels, but I skipped most of those to take in demos and mingle instead; after all, I can watch a talk on YouTube from home just fine. Oh, and there was also new mobile VR hardware to check out, and a big surprise. Let’s talk VR hardware. Continue reading →
I’ve recently received an Oculus Rift Development Kit Mk. II, and since I’m on Linux, there is no official SDK for me and I’m pretty much out there on my own. But that’s OK; it’s given me a chance to experiment with the DK2 as a black box, and investigate some ways how I could support it in my VR toolkit under Linux, and improve Vrui’s user experience while I’m at it. And I also managed to score a genuine Oculus VR Latency Tester, and did a set of experiments with interesting results. If you just want to see those results, skip to the end.
The Woes of Windows
If you’ve been paying attention to the Oculus subreddit since the first DK2s have been delivered to developers/enthusiasts, there is a common consensus that the user experience of the DK2 and the SDK that drives it could be somewhat improved. Granted, it’s a developer’s kit and not a consumer product, but even developers seem to be spending more time getting the DK2 to run smoothly, or run at all, than actually developing for it (or at least that’s the impression I get from the communal bellyaching).
I have talkedmanytimes about the importance of eye tracking for head-mounted displays, but so far, eye tracking has been limited to the very high end of the HMD spectrum. Not anymore. SensoMotoric Instruments, a company with around 20 years of experience in vision-based eye tracking hardware and software, unveiled a prototype integrating the camera-based eye tracker from their existing eye tracking glasses with an off-the-shelf Oculus Rift DK1 HMD (see Figure 1). Fortunately for me, SMI were showing their eye-tracked Rift at the 2014 Augmented World Expo, and offered to bring it up to my lab to let me have a look at it.
Figure 1: SMI’s after-market modified Oculus Rift with one 3D eye tracking camera per eye. The current tracking cameras need square cut-outs at the bottom edge of each lens to provide an unobstructed view of the user’s eyes; future versions will not require such extensive modifications.
My friend Serban got his Oculus Rift dev kit in the mail today, and he called me over to check it out. I will hold back a thorough evaluation until I get the Rift supported natively in my own VR software, so that I can run a direct head-to-head comparison with my other HMDs, and also my screen-based holographic display systems (the head-tracked 3D TVs, and of course the CAVE), using the same applications. Specifically, I will use the Quake ||| Arena viewer to test the level of “presence” provided by the Rift; as I mentioned in my previous post, there are some very specific physiological effects brought out by that old chestnut, and my other HMDs are severely lacking in that department, and I hope that the Rift will push it close to the level of the CAVE. But here are some early impressions.
Figure 1: What it would look like to unbox an Oculus VR dev kit, if one were to have such a thing.
I don’t think there’s need to introduce the Oculus Rift HMD. Everyone’s heard of it, and everyone’s psyched – including me.
However, HMDs are prone to certain issues, and while that shouldn’t detract us from embracing them, we should be careful to do it right this time. The last thing the VR field needs right now is a viral YouTube video along the lines of “Oh, an Oculus Rift! Cool! Let me try it on… Wow, that’s awesoBLEEAAARRGHHH.”
To back up a little: when HMDs became a thing in the 80s, they tended to induce dizziness and nausea in viewers, after a relatively short time of using them. Interestingly, HMDs had generally worse effects than other types of immersive display environments such as CAVEs. The basic theory of simulation sickness is based on virtual motion, and does not account for this difference.
The commonly stated explanation for this difference is display lag. In an HMD, the screens move with the viewer’s head, and any delay will cause the virtual world to move along with the viewer until the display system catches up. Imagine wearing an HMD and quickly turning your head to the side. Say it takes 30ms total until this motion is noticed by the head tracking system, the application updates its internal state, renders the new state, and refreshes the HMD’s screens. During this interval, the world will turn with you, and it will snap back to its original orientation once the delay time has passed. The real world does not behave like that, and because HMD-based graphics tap deeply into our brain’s visual system, this is very disorienting and adds to the discomfort. In a CAVE, on the other hand, the screens do not move with the viewer. Delay will still cause a disturbance in the projection of the virtual world, as the actual viewer position will not match the virtual one, but because screens are large and relatively far away, this will be barely noticeable. So far, so good.
Alas, there is an additional, often overlooked, factor — display calibration. Any immersive graphics system, HMD or CAVE or else, needs to exactly replicate how virtual objects are projected onto the system’s real screens, and then seen by the user (how exactly that works is a topic for another post). The bottom line is that the graphics software needs to know the absolute positions and orientations of all screens, and the absolute positions of the viewer’s eyes. Determining this is the job of head tracking and system calibration. But in an HMD, unlike in a CAVE, the tolerances for calibration are very low. The screens are very small and very close to the viewer’s eyes, which doesn’t leave much room for error (see Figure 1). Even worse, there is no way to precisely don an HMD short of putting screws into one’s skull; every time you put it on, it sits slightly differently. And that means any pre-configured projection parameters will not match reality.
Figure 1: Diagram of a hypothetical HMD for calibration purposes. The HMD consists of small real screen mounted directly in front of the viewer’s eyes, and uses optics to create larger virtual screens at a longer distance away to allow users to properly focus on those screens. For proper calibration, graphics software needs to know the precise positions of the viewer’s pupils and the exact positions and sizes of the virtual screens, in some coordinate system. Head tracking will provide the mapping from this viewer-attached coordinate system to the world coordinate system to allow users to look and walk around.
These mismatches have several effects. For one, imagine that a viewer wears an HMD slightly askew, so that the two screens have different vertical positions in front of their respective eyes. If the software does not account for that, the two stereo images will be vertically displaced, something that does not happen in real life. The viewer’s eyes will make up for it, up to a point, by moving up/down independently, but that is an unnatural motion and causes eye strain. It’s the same effect as watching a 3D movie in a theater while not holding one’s head level — it will hurt later.
Another, more subtle, effect is that in a miscalibrated display system the virtual world does not behave as the real world would. Do a simple experiment: fire up some first-person video game that allows view configuration, such as Doom3, and set a high field of view. Then rotate the view and observe. The virtual world will display a strong distortion effect, meaning that the sizes of objects, and their internal angles, change as the viewpoint changes. This is an extreme example, but even slight discrepancies are subconsciously unsettling, because our visual system is very good at detecting if something is not right with the world, and it tells us that by making us sick.
Even in non-immersive 3D graphics, a too large discrepancy between real field of view (how large the screen looms in our visual field) and programmatic field-of-view is known to cause motion sickness, and immersive 3D graphics with the same issue will be much worse. FOV discrepancy is only one symptom of miscalibration, but it’s the one that’s easiest to demonstrate; the others are more subtle (but that’s a topic for another post). In the end, miscalibration is a nasty problem because it is subtle, very hard to correct, and causes significant ill effects.
I noticed these things when I started experimenting with my own HMDs a while ago (I have an eMagin Z800 3DVisor and a Sony HMZ-T1). I experimented with rapid motions, but those didn’t really make me dizzy. I did notice, however, that the world didn’t seem solid, but as if it was made from jelly. I expected that, not having done proper calibration yet, so I used an interactive calibration utility to set up the system just so. After that, the world seemed stable, and interestingly I didn’t notice any more issues from lag. Not having done any further experiments, my hunch is that miscalibration is actually a bigger problem than lag. (Disclosure: while I was using a low-latency Intersense IS-900 tracking system, the computer running the show was fairly old, and the Quake3 renderer had no particular performance tweaking, so I estimated total system delay around 30ms).
So what’s the take-home message from this wall of text? If we want HMDs to succeed, we need to treat them properly in our graphics software. We need to use proper projection models instead of the standard camera model (but that’s a topic for another post), and not simply apply ad-hoc stereo models such as toe-in etc. (but that’s a topic for another post). It might work for a demo, but it won’t be pretty, and it will make our users sick. Instead, we need to know exactly how the HMD is laid out internally (screen placement and size, effects from the optical system in front of the screens, lens distortions, etc.), and, just as importantly, we need to know exactly where the viewer’s eyes are with respect to the screens (see Figure 1). This last one is the hard part. Maybe a future perfect HMD will contain one pair of stereo cameras per screen that will accurately track the viewer’s pupils and allow the graphics software to set up the projection parameters correctly, no matter how the HMD is worn and how the viewer moves. But until then, we need to come up with a practical approach, and we need to find simple methods to calibrate HMDs on the fly, and teach our users how to use those methods.
Well, and, of course, we mustn’t forget about minimizing lag, either. That would be too easy.
Oh, and by the way, want to get a quick glimpse of just how immersive the Oculus Rift will be (going by current specs)? If your monitor is X inches wide, put your eye X/2 inches in front of the monitor’s center — that’s about what it will look like. If you want to play a first-person game from that viewpoint and have it look right, set the horizontal field of view to 90 degrees.